[ 
https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978826#comment-14978826
 ] 

Sujen Shah commented on NUTCH-2153:
-----------------------------------

Hi [~ahmadia] and [~chrismattmann], 

Currently, when the Nutch REST services are used in local mode, the crawldb job 
executes quickly. In distributed mode, however, the same job can take a fair 
amount of time, so a GET request would make the client wait a long time for 
the response. 
A POST request was used because the crawldb resource is created when the user 
issues the request rather than precomputed (which is usually the case when a 
GET is used). The /db endpoint still needs work: it should spin up threads for 
the computation, as the /job endpoint does, and then provide a GET interface 
to query the results.
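
As a rough illustration of that pattern (a POST that starts the computation 
and returns immediately, followed by GETs that poll for the result), here is a 
minimal Python sketch. It is not Nutch code: JobStore, submit, status and 
slow_crawldb_stats are hypothetical names, and the returned statistics are 
made up.

```python
import threading
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobStore:
    """Hypothetical in-memory job registry; not part of the Nutch REST API."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}
        self._lock = threading.Lock()

    def submit(self, task, *args):
        """Models the POST: start the work in a thread, return an id at once."""
        job_id = uuid.uuid4().hex
        future = self._pool.submit(task, *args)
        with self._lock:
            self._futures[job_id] = future
        return job_id

    def status(self, job_id):
        """Models the GET: report the job state without blocking on the work."""
        with self._lock:
            future = self._futures.get(job_id)
        if future is None:
            return {"state": "UNKNOWN"}
        if not future.done():
            return {"state": "RUNNING"}
        if future.exception() is not None:
            return {"state": "FAILED", "error": str(future.exception())}
        return {"state": "FINISHED", "result": future.result()}


def slow_crawldb_stats(delay_seconds):
    """Stand-in for a long-running crawldb job; the numbers are invented."""
    time.sleep(delay_seconds)
    return {"urls": 3}
```

A client would call submit() once, keep the returned id, and then call 
status() until the state is FINISHED or FAILED, so no single request has to 
stay open for the duration of the job.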

I have tried to apply the same concept in the commoncrawldump service, since 
that can also take a long time as the amount of crawled data grows. 

I would like to know your thoughts on how to handle such cases, where serving 
a GET requires computing the resource first. 

Thanks!

> Nutch REST API (DB) uses POST instead of GET to request
> -------------------------------------------------------
>
>                 Key: NUTCH-2153
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2153
>             Project: Nutch
>          Issue Type: Bug
>          Components: REST_api
>    Affects Versions: 1.11
>            Reporter: Aron Ahmadia
>            Priority: Trivial
>              Labels: memex
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
