[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978826#comment-14978826 ]
Sujen Shah commented on NUTCH-2153:
-----------------------------------
Hi [~ahmadia] and [~chrismattmann],
Currently, when the Nutch REST service runs in local mode, the crawldb job
finishes quickly. In distributed mode, however, the same job can take
considerably longer, so a client issuing a GET request would have to wait a
long time for the response.
A POST request was used because the crawldb resource is created only when a
user issues a request; it is not precomputed, which is usually the case when a
GET is used. The /db endpoint still needs further development so that it can
spin up threads for the computation, as the /job endpoint does, and then
expose a GET interface for querying the results.
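To make the idea concrete, here is a minimal sketch of the "create with POST,
then poll with GET" pattern described above. It is not Nutch code and does not
use the Nutch REST API: JobStore, slow_crawldb_stats and poll_until_done are
hypothetical names, and an in-memory thread pool stands in for the server side.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobStore:
    """In-memory stand-in for a job-style endpoint (hypothetical, not Nutch)."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def create(self, task, *args):
        """POST-like call: start the work in a thread and return an id at once."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(task, *args)
        return job_id

    def get(self, job_id):
        """GET-like call: report the state without blocking on the computation."""
        future = self._jobs[job_id]
        if not future.done():
            return {"id": job_id, "state": "RUNNING"}
        error = future.exception()
        if error is not None:
            return {"id": job_id, "state": "FAILED", "error": str(error)}
        return {"id": job_id, "state": "FINISHED", "result": future.result()}

    def shutdown(self):
        """Wait for outstanding jobs and release the worker threads."""
        self._pool.shutdown(wait=True)


def slow_crawldb_stats(delay_seconds):
    """Placeholder for a long-running crawldb job (assumed, for illustration)."""
    time.sleep(delay_seconds)
    return {"status": "ok"}


def poll_until_done(store, job_id, timeout=5.0, interval=0.01):
    """Client side: repeat the cheap GET-like call until the job leaves RUNNING."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = store.get(job_id)
        if status["state"] != "RUNNING":
            return status
        time.sleep(interval)
    raise TimeoutError("job %s did not finish within %.1fs" % (job_id, timeout))
```

With this split, the request that triggers the computation returns
immediately, and every later read is a cheap lookup that has no side effects,
which is what a client normally expects from a GET.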
I have applied the same concept to the commoncrawldump service, since it may
also take longer as the amount of crawled data grows.
I would like to hear your thoughts on how to handle such cases, where issuing
a GET requires computing the resource.
Thanks!
> Nutch REST API (DB) uses POST instead of GET to request
> -------------------------------------------------------
>
> Key: NUTCH-2153
> URL: https://issues.apache.org/jira/browse/NUTCH-2153
> Project: Nutch
> Issue Type: Bug
> Components: REST_api
> Affects Versions: 1.11
> Reporter: Aron Ahmadia
> Priority: Trivial
> Labels: memex
>