[jira] [Comment Edited] (NUTCH-1823) Upgrade to elasticsearch 1.4.1

2014-12-10 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240848#comment-14240848 ] Julien Nioche edited comment on NUTCH-1823 at 12/10/14 9:25 AM:

[jira] [Commented] (NUTCH-1823) Upgrade to elasticsearch 1.4.1

2014-12-10 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240848#comment-14240848 ] Julien Nioche commented on NUTCH-1823: I'd like to review the patch but have quite a

Re: Already subscribed to dev@nutch.apache.org

2014-12-10 Thread Tizy Ninan
Hi, I am trying to develop a custom crawler in Java, using Nutch v1.9, to crawl websites that require form-based authentication. I am following Nutch's HttpPostAuthentication feature to implement it. The login parameters required for authentication such as HTML form id, login post

Not able to crawl a website using Nutch

2014-12-10 Thread Thalatam, Venkata naveen
Hello, Can someone assist me with the error below? I am trying to crawl a website within the organization using Nutch, without success. Best Regards, Venkata Naveen Thalatam (Naveen)

RE: Not able to crawl a website using Nutch

2014-12-10 Thread Thalatam, Venkata naveen
From: Thalatam, Venkata naveen Sent: Wednesday, December 10, 2014 5:25 PM To: dev@nutch.apache.org Subject: Not able to crawl a website using Nutch Importance: High Hello, Can someone assist me with the error below? I am trying to crawl a website within

[jira] [Updated] (NUTCH-1823) Upgrade to elasticsearch 1.4.1

2014-12-10 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1823: Attachment: NUTCH-1823-trunk.patch Trunk has a slightly different ivy.xml; here's a patch that

Re: Not able to crawl a website using Nutch

2014-12-10 Thread feng lu
Hi Thalatam, You can check this tutorial to learn how to use the Nutch command-line interface: http://wiki.apache.org/nutch/NutchTutorial The bin/nutch crawl command is deprecated; you can use the bin/crawl command instead. On Wed, Dec 10, 2014 at 7:56 PM, Thalatam, Venkata naveen

HttpPostAuthentication

2014-12-10 Thread Tizy Ninan
Hi, I am trying to develop a custom crawler in Java, using Nutch v1.9, to crawl websites that require form-based authentication. I am following Nutch's HttpPostAuthentication feature to implement it. The login parameters required for authentication such as HTML form id, login post

[jira] [Created] (NUTCH-1896) SolrDeleteDuplicates does not use the mapped Solr field names from solrindex-mapping.xml

2014-12-10 Thread Brian (JIRA)
Brian created NUTCH-1896: Summary: SolrDeleteDuplicates does not use the mapped Solr field names from solrindex-mapping.xml Key: NUTCH-1896 URL: https://issues.apache.org/jira/browse/NUTCH-1896 Project:

[jira] [Updated] (NUTCH-1896) SolrDeleteDuplicates does not use the mapped Solr field names from solrindex-mapping.xml

2014-12-10 Thread Brian (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian updated NUTCH-1896: Description: SolrDeleteDuplicates uses the hard-coded field names specified in SolrConstants.java to get all the

[jira] [Updated] (NUTCH-1896) SolrDeleteDuplicates does not use the mapped Solr field names from solrindex-mapping.xml

2014-12-10 Thread Brian (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian updated NUTCH-1896: Description: SolrDeleteDuplicates uses the hard-coded field names specified in SolrConstants.java to get all the

[jira] [Commented] (NUTCH-1823) Upgrade to elasticsearch 1.4.1

2014-12-10 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241341#comment-14241341 ] Lewis John McGibbney commented on NUTCH-1823: np [~jnioche], going to work my

[jira] [Commented] (NUTCH-1895) run() method in Crawler.java doesnt put Nutch.ARG_BATCH in argMap

2014-12-10 Thread FeiTian (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241965#comment-14241965 ] FeiTian commented on NUTCH-1895: Hi JIRA, Thanks for your reply. The script