Re: trouble using nutch server

2015-04-07 Thread Sujen Shah
Hi, It seems like you are using Nutch 2.x, and the args you passed look like the ones from the documentation of the Nutch 1.x REST service. Could you please tell me which documentation you referred to? Regards, Sujen Shah M.S - Computer Science (Class of 2016) University of Southern California +1

Re: trouble using nutch server

2015-04-08 Thread Sujen Shah
.../apache-nutch-2.3/runtime/local/url/"} } I do not know about the documentation of 2.x Sorry for the late reply. Hope it helps :) Regards, Sujen Shah M.S - Computer Science (Class of 2016) University of Southern California +1(213)-820-9169 http://www.linkedin.com/in/sujenshah On Tue, Apr

Generate separate fetchlist by host

2015-06-15 Thread Sujen Shah
g multiple fetch partitions and segments. And generate.max.count, generate.count.mode allow some configurations. But I did not understand if it is possible to generate multiple fetchlists (I am currently working in a local mode) Thank you. Regards, Sujen Shah M.S - Computer Science (Class of 2016)

Re: [DISCUSS] Release Nutch trunk 1.11

2015-08-26 Thread Sujen Shah
Hey Chris, I'm working on it as I write this mail. Testing out the services as there are minor changes in the REST endpoints between 2x and 1x. Will post an update by 2pm PT. Regards, Sujen Shah. On Aug 26, 2015 7:09 AM, "Mattmann, Chris A (3980)" < chris.a.mattm...@jpl.nasa.go

Re: [ANNOUNCE] New Nutch committer and PMC - Sujen Shah

2015-09-15 Thread Sujen Shah
and Lewis McGibbney at school and at NASA JPL for quite some time now and have been involved in developing the Nutch REST services, focusing capabilities and currently working on the Apache Wicket based Web UI. Looking forward to getting engaged with the community even more :) Thanks, Sujen Shah M.S

SVN-GIT mirror not updated for Revision 1705744

2015-09-30 Thread Sujen Shah
Hi All, The recent commit with the Nutch 1x webui is not mirrored on the GitHub repository; the commit exists in svn trunk. I have filed an INFRA ticket - https://issues.apache.org/jira/browse/INFRA-10515. Regards, Sujen Shah M.S - Computer Science (Class of 2016) University of Southern

Re: Request for inclusion in the Nutch email list

2015-10-01 Thread Sujen Shah
Hi Pramod, To subscribe to the list you need to send a mail to dev-subscr...@nutch.apache.org. For more instructions have a look at - http://nutch.apache.org/mailing_lists.html Cheers, Sujen Shah On Tue, Sep 29, 2015 at 10:22 PM, Pramod Nagarajarao wrote: > Hello Team, > > I'm P

Re: Team 18 : Similarity scoring: goldstandard.txt, stopwords.txt contents

2015-10-07 Thread Sujen Shah
example soon. Best, Sujen Regards, Sujen Shah M.S - Computer Science (Class of 2016) University of Southern California http://www.linkedin.com/in/sujenshah On Wed, Oct 7, 2015 at 6:52 AM, Christian Alan Mattmann wrote: > Sujen can you provide an example on the existing Scoring > Similarit

Re: [MASSMAIL][VOTE] Release Apache Nutch 1.11 RC#2

2015-12-04 Thread Sujen Shah
+1 Regards, Sujen Shah M.S - Computer Science (Class of 2016) University of Southern California http://www.linkedin.com/in/sujenshah On Fri, Dec 4, 2015 at 10:20 AM, Roannel Fernández Hernández wrote: > +1 > > Regards > > -- > > *De: *"Lewi

Re: [VOTE] Moving to Git

2016-01-08 Thread Sujen Shah
+1 Regards, Sujen Shah M.S - Computer Science (Class of 2016) University of Southern California http://www.linkedin.com/in/sujenshah On Fri, Jan 8, 2016 at 2:58 PM, Julien Nioche wrote: > +1 to move to Git > > Note : I don't think Dennis is on the PMC anymore > > Ju > &g

Re: [selenium] running selenium headless

2016-03-28 Thread Sujen Shah
Hi, I can't get much info from the log you have pasted. Some Qs: Which browser are you using? Have you tried running the browser alone on the server before running nutch? Could you please attach the detailed logs from the hadoop.log file? Thanks. Regards, Sujen Shah M.S - Computer Sc

Re: Reg. License of Princeton WordNet

2016-04-01 Thread Sujen Shah
Hi Bhavya, Could you provide links to the libraries you are trying to leverage? Thanks! On Apr 1, 2016 4:08 PM, "Bhavya Sanghavi" wrote: > Hi, > > I am planning to integrate WordNet in the Scoring Similarity plugin of > Nutch. I just wanted to confirm that there is no conflict regarding the > li

Recent stackoverflow questions

2016-04-05 Thread Sujen Shah
Hey Devs, Just bringing your attention to recent questions being asked on stackoverflow regarding using Nutch as a service and regarding how to enable/store cookies. http://stackoverflow.com/questions/36425447/can-nutch-be-deployed-to-crawl-specific-pages

Plugin dependancies do not get added to classpath while running Nutch in local mode

2016-09-12 Thread Sujen Shah
ed to modify the root ivy.xml for plugin specific dependencies. I wanted to ask the devs first if there was already a solution before filing a JIRA issue. If not, I'll submit it through JIRA. Thank you for your help. Regards, Sujen Shah plugin-dependency.patch Description: Binary data

Re: Plugin dependancies do not get added to classpath while running Nutch in local mode

2016-09-22 Thread Sujen Shah
/nutch/blob/master/src/plugin/ > parse-tika/plugin.xml > > This double work is not ideal and a frequent cause for errors but that's > how it works right now. > > Cheers, > Sebastian > > > On 09/12/2016 11:56 PM, Sujen Shah wrote: > > Hi Devs, > > > >

Re: Plugin dependancies do not get added to classpath while running Nutch in local mode

2016-09-25 Thread Sujen Shah
bsolute, it is used > as is. If relative, it is searched for on the classpath. > > > See also my comments on https://github.com/apache/nutch/pull/152 > > Sebastian > > > On 09/23/2016 12:06 AM, Sujen Shah wrote: > > Thank you Sebastian for your response. > >

Re: Nutch Cosine Filter

2016-12-02 Thread Sujen Shah
Hi, Thank you for your feedback! Appreciate it. Currently, there are no tools apart from the ones you have already experimented with (topN and generate.min.score) to direct the crawl towards the top scoring urls. I wonder why generate.min.score did not work. I looked into the code and it

[jira] [Created] (NUTCH-1931) Apache Nutch 1.x REST service and crawler visualization

2015-02-02 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-1931: - Summary: Apache Nutch 1.x REST service and crawler visualization Key: NUTCH-1931 URL: https://issues.apache.org/jira/browse/NUTCH-1931 Project: Nutch Issue Type

[jira] [Created] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931]

2015-03-17 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-1966: - Summary: Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931] Key: NUTCH-1966 URL: https://issues.apache.org/jira/browse/NUTCH-1966 Project: Nutch

[jira] [Updated] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931]

2015-03-17 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-1966: -- Issue Type: Sub-task (was: Task) Parent: NUTCH-1931 > Configuration endpoint for 1x REST

[jira] [Commented] (NUTCH-1931) Apache Nutch 1.x REST service and crawler visualization

2015-03-17 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365514#comment-14365514 ] Sujen Shah commented on NUTCH-1931: --- Hi, I created a sub task as making a pull req

[jira] [Commented] (NUTCH-1966) Configuration endpoint for 1x REST API [A sub-issue of NUTCH-1931]

2015-03-17 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365517#comment-14365517 ] Sujen Shah commented on NUTCH-1966: --- The link to the pull request https://github

[jira] [Created] (NUTCH-1973) Job Administration end point for the REST service

2015-03-20 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-1973: - Summary: Job Administration end point for the REST service Key: NUTCH-1973 URL: https://issues.apache.org/jira/browse/NUTCH-1973 Project: Nutch Issue Type: Sub

[jira] [Commented] (NUTCH-1973) Job Administration end point for the REST service

2015-03-31 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389845#comment-14389845 ] Sujen Shah commented on NUTCH-1973: --- Github pull request created - https://github

[jira] [Updated] (NUTCH-1973) Job Administration end point for the REST service

2015-04-21 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-1973: -- Attachment: NUTCH-1973.patch I have created a patch by hand, can you try this one. >

[jira] [Commented] (NUTCH-1973) Job Administration end point for the REST service

2015-04-21 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506372#comment-14506372 ] Sujen Shah commented on NUTCH-1973: --- Keep the model/request one > Job Adminis

[jira] [Created] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-05-15 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2011: - Summary: Endpoint to support realtime JSON output from the fetcher Key: NUTCH-2011 URL: https://issues.apache.org/jira/browse/NUTCH-2011 Project: Nutch Issue Type

[jira] [Commented] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-05-15 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545603#comment-14545603 ] Sujen Shah commented on NUTCH-2011: --- PR link - https://github.com/apache/nutch/pul

[jira] [Created] (NUTCH-2015) Make FetchNodeDb optional (off by default) if NutchServer is not used

2015-05-16 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2015: - Summary: Make FetchNodeDb optional (off by default) if NutchServer is not used Key: NUTCH-2015 URL: https://issues.apache.org/jira/browse/NUTCH-2015 Project: Nutch

[jira] [Commented] (NUTCH-2015) Make FetchNodeDb optional (off by default) if NutchServer is not used

2015-05-16 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546648#comment-14546648 ] Sujen Shah commented on NUTCH-2015: --- PR link - https://github.com/apache/nutch/pul

[jira] [Commented] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-05-16 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546650#comment-14546650 ] Sujen Shah commented on NUTCH-2011: --- Thank you for your inputs [~wastl-nagel].

[jira] [Commented] (NUTCH-2015) Make FetchNodeDb optional (off by default) if NutchServer is not used

2015-05-17 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547239#comment-14547239 ] Sujen Shah commented on NUTCH-2015: --- Yes true, you are right. Will make the neces

[jira] [Commented] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-05-18 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547812#comment-14547812 ] Sujen Shah commented on NUTCH-2011: --- Hi [~wastl-nagel], Just to add a littl

[jira] [Comment Edited] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-05-18 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547812#comment-14547812 ] Sujen Shah edited comment on NUTCH-2011 at 5/18/15 10:0

[jira] [Commented] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-05-19 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549957#comment-14549957 ] Sujen Shah commented on NUTCH-2011: --- Thanks for sharing the plugin link will look

[jira] [Commented] (NUTCH-2015) Make FetchNodeDb optional (off by default) if NutchServer is not used

2015-05-29 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565626#comment-14565626 ] Sujen Shah commented on NUTCH-2015: --- Hi [~chrismattmann], I am testing the cha

[jira] [Commented] (NUTCH-2015) Make FetchNodeDb optional (off by default) if NutchServer is not used

2015-05-29 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565668#comment-14565668 ] Sujen Shah commented on NUTCH-2015: --- Hi [~wastl-nagel], I updated the code as

[jira] [Created] (NUTCH-2031) Create Admin End point for Nutch 1.x REST service

2015-06-02 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2031: - Summary: Create Admin End point for Nutch 1.x REST service Key: NUTCH-2031 URL: https://issues.apache.org/jira/browse/NUTCH-2031 Project: Nutch Issue Type: Sub

[jira] [Commented] (NUTCH-2031) Create Admin End point for Nutch 1.x REST service

2015-06-02 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569587#comment-14569587 ] Sujen Shah commented on NUTCH-2031: --- GitHub PR link - https://github.com/apache/n

[jira] [Created] (NUTCH-2037) Job endpoint to support Indexing from the REST API

2015-06-07 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2037: - Summary: Job endpoint to support Indexing from the REST API Key: NUTCH-2037 URL: https://issues.apache.org/jira/browse/NUTCH-2037 Project: Nutch Issue Type: Sub

[jira] [Commented] (NUTCH-2037) Job endpoint to support Indexing from the REST API

2015-06-07 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576521#comment-14576521 ] Sujen Shah commented on NUTCH-2037: --- I have tested the code using Solr

[jira] [Created] (NUTCH-2039) Relevance based scoring filter

2015-06-10 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2039: - Summary: Relevance based scoring filter Key: NUTCH-2039 URL: https://issues.apache.org/jira/browse/NUTCH-2039 Project: Nutch Issue Type: New Feature

[jira] [Commented] (NUTCH-2039) Relevance based scoring filter

2015-06-14 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585467#comment-14585467 ] Sujen Shah commented on NUTCH-2039: --- To run, enable the plugin (scoring-simila

[jira] [Commented] (NUTCH-2039) Relevance based scoring filter

2015-06-15 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586355#comment-14586355 ] Sujen Shah commented on NUTCH-2039: --- Thank you [~chrismattmann] and [~wastl-nagel]

[jira] [Commented] (NUTCH-2039) Relevance based scoring filter

2015-06-15 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587113#comment-14587113 ] Sujen Shah commented on NUTCH-2039: --- Done, updated the PR. > Relevance based

[jira] [Commented] (NUTCH-2039) Relevance based scoring filter

2015-06-16 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588976#comment-14588976 ] Sujen Shah commented on NUTCH-2039: --- Thanks [~lewismc], [~wastl-nagel]

[jira] [Created] (NUTCH-2047) Improvements to the relevance scoring plugin

2015-06-24 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2047: - Summary: Improvements to the relevance scoring plugin Key: NUTCH-2047 URL: https://issues.apache.org/jira/browse/NUTCH-2047 Project: Nutch Issue Type: Improvement

[jira] [Updated] (NUTCH-2047) Improvements to the relevance scoring plugin

2015-06-24 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2047: -- Attachment: part-0 This file is a dump of the top 1000 URLs. The model file contained information

[jira] [Comment Edited] (NUTCH-2047) Improvements to the relevance scoring plugin

2015-06-24 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600327#comment-14600327 ] Sujen Shah edited comment on NUTCH-2047 at 6/24/15 11:0

[jira] [Created] (NUTCH-2066) Allow user to specify crawldb and segment db in the Generate JOb REST endpoint

2015-07-23 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2066: - Summary: Allow user to specify crawldb and segment db in the Generate JOb REST endpoint Key: NUTCH-2066 URL: https://issues.apache.org/jira/browse/NUTCH-2066 Project

[jira] [Updated] (NUTCH-2066) Allow user to specify crawldb and segment db in the Generate JOb REST endpoint

2015-07-23 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2066: -- Priority: Minor (was: Major) Fix Version/s: 1.11 Component/s: REST_api > Allow user

[jira] [Created] (NUTCH-2070) Allow user to specify segment to Fetch via the REST API

2015-07-29 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2070: - Summary: Allow user to specify segment to Fetch via the REST API Key: NUTCH-2070 URL: https://issues.apache.org/jira/browse/NUTCH-2070 Project: Nutch Issue Type

[jira] [Updated] (NUTCH-2070) Allow user to specify segment to Fetch via the REST API

2015-07-29 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2070: -- Attachment: NUTCH-2070.patch Uploading a patch file. > Allow user to specify segment to Fetch via

[jira] [Updated] (NUTCH-2070) Parameterize Fetch REST Endpoint

2015-08-19 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2070: -- Labels: memex (was: ) > Parameterize Fetch REST Endpo

[jira] [Updated] (NUTCH-2070) Parameterize Fetch REST Endpoint

2015-08-19 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2070: -- Summary: Parameterize Fetch REST Endpoint (was: Allow user to specify segment to Fetch via the REST

[jira] [Created] (NUTCH-2086) Nutch 1.X Webui

2015-08-25 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2086: - Summary: Nutch 1.X Webui Key: NUTCH-2086 URL: https://issues.apache.org/jira/browse/NUTCH-2086 Project: Nutch Issue Type: New Feature Components

[jira] [Created] (NUTCH-2090) Refactor Seed Resource

2015-09-03 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2090: - Summary: Refactor Seed Resource Key: NUTCH-2090 URL: https://issues.apache.org/jira/browse/NUTCH-2090 Project: Nutch Issue Type: Sub-task Components

[jira] [Updated] (NUTCH-2090) Refactor Seed Resource in REST API

2015-09-04 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2090: -- Summary: Refactor Seed Resource in REST API (was: Refactor Seed Resource ) > Refactor Seed Resource

[jira] [Updated] (NUTCH-2090) Refactor Seed Resource in REST API

2015-09-04 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2090: -- Attachment: NUTCH-2090.patch Attaching a Git Patch > Refactor Seed Resource in REST

[jira] [Created] (NUTCH-2092) Unit Test for NutchServer

2015-09-08 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2092: - Summary: Unit Test for NutchServer Key: NUTCH-2092 URL: https://issues.apache.org/jira/browse/NUTCH-2092 Project: Nutch Issue Type: Sub-task Components

[jira] [Commented] (NUTCH-2086) Nutch 1.X Webui

2015-09-14 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743775#comment-14743775 ] Sujen Shah commented on NUTCH-2086: --- Hi [~lewismc] and [~chrismattmann], I was fa

[jira] [Comment Edited] (NUTCH-2086) Nutch 1.X Webui

2015-09-14 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743775#comment-14743775 ] Sujen Shah edited comment on NUTCH-2086 at 9/14/15 4:2

[jira] [Created] (NUTCH-2099) Refactoring the REST endpoints for integration with webui

2015-09-15 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2099: - Summary: Refactoring the REST endpoints for integration with webui Key: NUTCH-2099 URL: https://issues.apache.org/jira/browse/NUTCH-2099 Project: Nutch Issue Type

[jira] [Commented] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-09-18 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14876639#comment-14876639 ] Sujen Shah commented on NUTCH-2011: --- Hi [~ahmadia], There is an implementation of

[jira] [Commented] (NUTCH-2086) Nutch 1.X Webui

2015-09-21 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900321#comment-14900321 ] Sujen Shah commented on NUTCH-2086: --- Hi [~lewismc], the PR for this issue is h

[jira] [Created] (NUTCH-2119) Eclipse shows build path errors on building Nutch

2015-09-24 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2119: - Summary: Eclipse shows build path errors on building Nutch Key: NUTCH-2119 URL: https://issues.apache.org/jira/browse/NUTCH-2119 Project: Nutch Issue Type: Bug

[jira] [Commented] (NUTCH-2119) Eclipse shows build path errors on building Nutch

2015-09-24 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907326#comment-14907326 ] Sujen Shah commented on NUTCH-2119: --- Committed to trunk (1705203) > Eclips

[jira] [Resolved] (NUTCH-2119) Eclipse shows build path errors on building Nutch

2015-09-24 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah resolved NUTCH-2119. --- Resolution: Fixed > Eclipse shows build path errors on building Nu

[jira] [Assigned] (NUTCH-2119) Eclipse shows build path errors on building Nutch

2015-09-24 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah reassigned NUTCH-2119: - Assignee: Sujen Shah > Eclipse shows build path errors on building Nu

[jira] [Created] (NUTCH-2121) Update javadoc link for Hadoop 2.4.0 in default.properties

2015-09-24 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2121: - Summary: Update javadoc link for Hadoop 2.4.0 in default.properties Key: NUTCH-2121 URL: https://issues.apache.org/jira/browse/NUTCH-2121 Project: Nutch Issue

[jira] [Resolved] (NUTCH-2121) Update javadoc link for Hadoop 2.4.0 in default.properties

2015-09-24 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah resolved NUTCH-2121. --- Resolution: Fixed Committed to trunk (1705205) > Update javadoc link for Hadoop 2.4.0

[jira] [Updated] (NUTCH-1966) Configuration endpoint for 1x REST API

2015-09-29 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-1966: -- Summary: Configuration endpoint for 1x REST API (was: Configuration endpoint for 1x REST API [A sub

[jira] [Created] (NUTCH-2128) Refactor configuration end point

2015-09-29 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2128: - Summary: Refactor configuration end point Key: NUTCH-2128 URL: https://issues.apache.org/jira/browse/NUTCH-2128 Project: Nutch Issue Type: Sub-task

[jira] [Assigned] (NUTCH-2128) Refactor configuration end point

2015-10-01 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah reassigned NUTCH-2128: - Assignee: Sujen Shah > Refactor configuration end po

[jira] [Updated] (NUTCH-2123) Seed List REST API returns Text but headers indicate/require JSON

2015-10-01 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2123: -- Attachment: NUTCH-2123.patch Patch for correcting the response headers. > Seed List REST API retu

[jira] [Created] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-02 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2132: - Summary: Publisher/Subscriber model for Nutch to emit events Key: NUTCH-2132 URL: https://issues.apache.org/jira/browse/NUTCH-2132 Project: Nutch Issue Type: New

[jira] [Updated] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-02 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2132: -- Attachment: NUTCH-2132.patch Attaching a patch which describes my idea for a Pub/Sub model. This

[jira] [Commented] (NUTCH-2011) Endpoint to support realtime JSON output from the fetcher

2015-10-02 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940925#comment-14940925 ] Sujen Shah commented on NUTCH-2011: --- Hi [~wastl-nagel], [~chrismattmann] and [~ahm

[jira] [Updated] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-02 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2132: -- Description: It would be nice to have a Pub/Sub model in Nutch to emit certain events (ex- Fetcher

[jira] [Created] (NUTCH-2135) Ant Eclipse build does not include protocol-interactiveselenium

2015-10-09 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2135: - Summary: Ant Eclipse build does not include protocol-interactiveselenium Key: NUTCH-2135 URL: https://issues.apache.org/jira/browse/NUTCH-2135 Project: Nutch

[jira] [Created] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-23 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2149: - Summary: REST endpoint to read Nutch sequence files Key: NUTCH-2149 URL: https://issues.apache.org/jira/browse/NUTCH-2149 Project: Nutch Issue Type: New Feature

[jira] [Updated] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-23 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2149: -- Description: This endpoint enables reading of the webgraph data like nodes, links and any other

[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973360#comment-14973360 ] Sujen Shah commented on NUTCH-2149: --- Committed 1710468 > REST endpoint to rea

[jira] [Assigned] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah reassigned NUTCH-2149: - Assignee: Sujen Shah > REST endpoint to read Nutch sequence fi

[jira] [Resolved] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah resolved NUTCH-2149. --- Resolution: Fixed > REST endpoint to read Nutch sequence fi

[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973372#comment-14973372 ] Sujen Shah commented on NUTCH-2149: --- Ohh I didn't know that, will do that fr

[jira] [Resolved] (NUTCH-2128) Refactor configuration end point

2015-10-27 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah resolved NUTCH-2128. --- Resolution: Fixed > Refactor configuration end po

[jira] [Assigned] (NUTCH-2070) Parameterize Fetch REST Endpoint

2015-10-27 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah reassigned NUTCH-2070: - Assignee: Sujen Shah > Parameterize Fetch REST Endpo

[jira] [Closed] (NUTCH-2070) Parameterize Fetch REST Endpoint

2015-10-27 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah closed NUTCH-2070. - Resolution: Fixed Implemented as a part of https://issues.apache.org/jira/browse/NUTCH-2099

[jira] [Created] (NUTCH-2151) Service endpoint for REST API

2015-10-27 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2151: - Summary: Service endpoint for REST API Key: NUTCH-2151 URL: https://issues.apache.org/jira/browse/NUTCH-2151 Project: Nutch Issue Type: Sub-task

[jira] [Updated] (NUTCH-2151) Service endpoint for REST API

2015-10-27 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2151: -- Issue Type: New Feature (was: Sub-task) Parent: (was: NUTCH-1931) > Service endpoint

[jira] [Created] (NUTCH-2152) CommonCrawl dump via Service endpoint

2015-10-27 Thread Sujen Shah (JIRA)
Sujen Shah created NUTCH-2152: - Summary: CommonCrawl dump via Service endpoint Key: NUTCH-2152 URL: https://issues.apache.org/jira/browse/NUTCH-2152 Project: Nutch Issue Type: Sub-task

[jira] [Assigned] (NUTCH-2152) CommonCrawl dump via Service endpoint

2015-10-27 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah reassigned NUTCH-2152: - Assignee: Sujen Shah > CommonCrawl dump via Service endpo

[jira] [Work started] (NUTCH-2152) CommonCrawl dump via Service endpoint

2015-10-27 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2152 started by Sujen Shah. - > CommonCrawl dump via Service endpo

[jira] [Updated] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-27 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2132: -- Attachment: PubSub_routingkey.patch Patch to route different crawls with different routingkeys set in

[jira] [Updated] (NUTCH-2152) CommonCrawl dump via Service endpoint

2015-10-27 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2152: -- Attachment: NUTCH-2152.git.patch Here is the first iteration of the patch. The commoncrawl dump via

[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request

2015-10-28 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978826#comment-14978826 ] Sujen Shah commented on NUTCH-2153: --- Hi [~ahmadia] and [~chrismattmann], Curre

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-28 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978911#comment-14978911 ] Sujen Shah commented on NUTCH-2132: --- [~ahmadia], bq. One issue I'm having is

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-28 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978942#comment-14978942 ] Sujen Shah commented on NUTCH-2132: --- Yes the first patch does not have that prop

[jira] [Commented] (NUTCH-2132) Publisher/Subscriber model for Nutch to emit events

2015-10-28 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978952#comment-14978952 ] Sujen Shah commented on NUTCH-2132: --- Yes this is taken care of in the second patch.
