Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "Nutch_1.X_RESTAPI/RunningJobsTutorial" page has been changed by SujenShah:
https://wiki.apache.org/nutch/Nutch_1.X_RESTAPI/RunningJobsTutorial?action=diff&rev1=4&rev2=5

  2. :~$ bin/nutch startserver -port <port_number> -host <host_name> [If the 
host/port option is not specified then by default the server starts on 
localhost:8081]
  
  == Jobs ==
- Currently the service supports the running of the following jobs - Inject, 
Generate, Fetch, Parse, Updatedb, Invertlinks, Dedup and Readdb.
+ Currently the service supports the running of the following jobs - Inject, 
Generate, Fetch, Parse, Index, Updatedb, Invertlinks, Dedup and Readdb.
  Any new job can be created by issuing a POST request to /job/create with the 
following JSON data 
  {{{{
  POST /job/create
@@ -80, +80 @@

  }}}}
  
  === Fetch Job ===
- To run the generate job call POST /job/create with following
+ To run the fetch job call POST /job/create with following
  {{{{
  POST /job/create
  {  
@@ -109, +109 @@

  }}}}
  
  === Parse Job ===
- To run the generate job call POST /job/create with following
+ To run the parse job call POST /job/create with following
  {{{{
  POST /job/create
  {  
@@ -137, +137 @@

  }
  }}}}
  
+ === Index Job ===
+ To run the index job call POST /job/create with following
+ {{{{
+ POST /job/create
+ {  
+     "type":"INDEX",
+     "confId":"new-config",
+     "crawlId":"crawl01",
+     "args": {}
+ }
+ }}}}
+ 
+ Before running the index job, the user needs to configure an indexer. A 
user-defined indexer (such as Solr or Elasticsearch) can be configured through 
the configuration endpoint.
+ A detailed description of how to configure and run the index job can be found 
at 
[[https://wiki.apache.org/nutch/Nutch_1.X_RESTAPI/RunningJobsTutorial/IndexJob|here]].
+ 
+ The args can contain the following keys: crawldb, linkdb, params, dir, 
segements, noCommit, deleteGone, filter, normalize.
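
As a rough illustration of the key list above, a client could check an args map for unrecognised keys before submitting it. This is only a sketch: the key names are copied verbatim from the list (including its spelling of "segements"), the helper name `unknown_index_args` is hypothetical, and the page does not say what value types each key expects.

```python
# Keys copied verbatim from the list on the wiki page, spelling included.
INDEX_ARG_KEYS = {
    "crawldb", "linkdb", "params", "dir", "segements",
    "noCommit", "deleteGone", "filter", "normalize",
}


def unknown_index_args(args):
    """Return the sorted keys of `args` that are not in the documented list.

    Hypothetical client-side helper; it only compares key names and does
    not validate values, since the page does not describe them.
    """
    return sorted(set(args) - INDEX_ARG_KEYS)
```

For example, an args map with a miscased key such as "nocommit" would be reported, since key matching here is case-sensitive.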
+ 
+ The response to the request is returned as JSON:
+ {{{{
+ {
+     "confId":"new-config",
+     "args":{},
+     "crawlId":"crawl01",
+     "msg":"OK",
+     "id":"default-INDEX-572647647",
+     "state":"RUNNING",
+     "type":"INDEX",
+     "result":null
+ }
+ }}}}
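
The index job request and response shown above could be driven from a small client along these lines. This is a minimal sketch using only the Python standard library; the function names are hypothetical, the `Content-Type: application/json` header is an assumption about what the server expects, and the base URL is whatever host and port the server was started on (localhost:8081 by default, per the note at the top of this diff).

```python
import json
import urllib.request


def build_index_job_request(base_url, conf_id, crawl_id, args=None):
    """Build (but do not send) a POST /job/create request for an INDEX job.

    The payload fields mirror the JSON shown on the wiki page. The
    Content-Type header is an assumption, not something the page states.
    """
    payload = {
        "type": "INDEX",
        "confId": conf_id,
        "crawlId": crawl_id,
        "args": args or {},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/job/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage against a running server (not executed here):
#   req = build_index_job_request("http://localhost:8081", "new-config", "crawl01")
#   job = submit(req)
#   print(job["id"], job["state"])
```

Building the request separately from sending it keeps the payload easy to inspect without a running server.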
+ 
+ 
  === Updatedb Job ===
- To run the generate job call POST /job/create with following
+ To run the updatedb job call POST /job/create with following
  {{{{
  POST /job/create
  {  
@@ -167, +199 @@

  }}}}
  
  === Invertlinks Job ===
- To run the generate job call POST /job/create with following
+ To run the invertlinks job call POST /job/create with following
  {{{{
  POST /job/create
  {  
@@ -198, +230 @@

  
  
  === Dedup Job ===
- To run the generate job call POST /job/create with following
+ To run the dedup job call POST /job/create with following
  {{{{
  POST /job/create
  {  
