[
https://issues.apache.org/jira/browse/NUTCH-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833139#comment-15833139
]
ASF GitHub Bot commented on NUTCH-2353:
---------------------------------------
GitHub user jorgelbg opened a pull request:
https://github.com/apache/nutch/pull/175
Fix for NUTCH-2353 contributed by jorgelbg
This PR makes it possible to add metadata to the seed file created
using the REST API, as described in
https://issues.apache.org/jira/browse/NUTCH-2353. The supported syntax is:
```json
{
  "name": "name-of-seedlist",
  "seedUrls": [
    {
      "url": "http://example.com",
      "metadata": {
        "key1": "value1",
        "key2": "value2",
        "key3": "value3"
      }
    }
  ]
}
```
The old syntax is still supported:
```json
{
  "name": "name-of-seedlist",
  "seedUrls": ["http://www.example.com", "http://www.example1.com"]
}
```
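To illustrate how the two payload shapes relate, here is a minimal Python sketch (not the code in this PR, which is Java) that accepts either syntax and produces seed-file lines. It assumes the usual Nutch seed-file layout of a URL followed by tab-separated `key=value` pairs; the function name `seed_lines` is hypothetical and only used for illustration.

```python
import json


def seed_lines(payload):
    """Turn a seed-list payload (old or new syntax) into seed-file lines.

    Hypothetical helper: assumes one line per URL, with metadata appended
    as tab-separated key=value pairs.
    """
    lines = []
    for entry in payload["seedUrls"]:
        if isinstance(entry, str):
            # Old syntax: a bare URL string, no metadata.
            url, metadata = entry, {}
        else:
            # New syntax: an object with "url" and an optional "metadata" map.
            url, metadata = entry["url"], entry.get("metadata", {})
        fields = [url] + [f"{key}={value}" for key, value in metadata.items()]
        lines.append("\t".join(fields))
    return lines


new_style = json.loads("""
{
  "name": "name-of-seedlist",
  "seedUrls": [
    {"url": "http://example.com",
     "metadata": {"key1": "value1", "key2": "value2"}}
  ]
}
""")

old_style = json.loads("""
{
  "name": "name-of-seedlist",
  "seedUrls": ["http://www.example.com", "http://www.example1.com"]
}
""")

for line in seed_lines(new_style) + seed_lines(old_style):
    print(line)
```

Checking the type of each `seedUrls` entry is what lets a single code path serve both payload shapes, so existing clients that send a plain array of URLs keep working.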
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jorgelbg/nutch NUTCH-2353
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nutch/pull/175.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #175
----
commit 7deb576bc58bb74725cbb6c5d82d7b9244c6ad42
Author: Jorge Luis Betancourt <[email protected]>
Date: 2017-01-20T21:13:45Z
Fix for NUTCH-2353 contributed by jorgelbg
----
> Create seed file with metadata using the REST API
> -------------------------------------------------
>
> Key: NUTCH-2353
> URL: https://issues.apache.org/jira/browse/NUTCH-2353
> Project: Nutch
> Issue Type: Improvement
> Components: injector, REST_api
> Affects Versions: 1.12
> Reporter: Jorge Luis Betancourt Gonzalez
> Assignee: Jorge Luis Betancourt Gonzalez
> Priority: Minor
> Labels: rest_api
> Fix For: 1.13
>
>
> At the moment it is not possible to specify any metadata when creating a seed
> file using the REST API. The file gets created, but there is no option to add
> any metadata to the seed URLs.
> If we use a payload like this:
> {code}
> {
>   "name": "name-of-seedlist",
>   "seedUrls": [
>     {
>       "url": "http://example.com",
>       "metadata": {
>         "key1": "value1",
>         "key2": "value2",
>         "key3": "value3"
>       }
>     }
>   ]
> }
> {code}
> It should be easy to specify the desired metadata. This should also remain
> backward compatible with the previous array syntax, for cases where we only
> want to specify the list of URLs without any metadata at all.