Re: [ANNOUNCE] New Nutch committer and PMC - Omkar Reddy

2018-06-21 Thread Omkar Reddy
Thank you very much, Sebastian. Glad to be on board. Cheers, Omkar On 21 June 2018 at 13:48, Sebastian Nagel wrote: > Dear all, > > it is my pleasure to announce that Omkar Reddy has joined us > as a committer and member of the Nutch PMC. Omkar has worked > on upgrading Nutch

Re: Preparing to release Nutch 1.15 ?

2018-06-14 Thread Omkar Reddy
+1 On 14 June 2018 at 03:09, Furkan KAMACI wrote: > +1 > > > On Wed, 13 Jun 2018 at 21:04, Joe Obernberger < > joseph.obernber...@gmail.com> wrote: > >> Woot! >> >> >> >> On 6/11/2018 11:55 AM, Chris Mattmann wrote: >> > ++1! >> > >> > >> > >> > Sounds great. >> > >> > >> > >> >

[jira] [Commented] (NUTCH-2557) protocol-http fails to follow redirections when an HTTP response body is invalid

2018-06-12 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509469#comment-16509469 ] Omkar Reddy commented on NUTCH-2557: A simple and wise solution. Thanks. > protocol-http fa

[jira] [Commented] (NUTCH-2557) protocol-http fails to follow redirections when an HTTP response body is invalid

2018-05-25 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490581#comment-16490581 ] Omkar Reddy commented on NUTCH-2557: I agree, sometimes the http body of bad requests and redirects

[jira] [Commented] (NUTCH-2575) protocol-http does not respect the maximum content-size for chunked responses

2018-05-24 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488900#comment-16488900 ] Omkar Reddy commented on NUTCH-2575: I have taken up [NUTCH-2557|https://issues.apache.org/jira/browse

[jira] [Commented] (NUTCH-2575) protocol-http does not respect the maximum content-size

2018-05-06 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465037#comment-16465037 ] Omkar Reddy commented on NUTCH-2575: Hi [~gbouchar], I see the issue, while reading every chunk we

[jira] [Commented] (NUTCH-2553) Fetcher not to modify URLs to be fetched

2018-04-16 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439219#comment-16439219 ] Omkar Reddy commented on NUTCH-2553: WOW! :O I couldn't have figured that one on my own. Thanks

[jira] [Commented] (NUTCH-2551) NullPointerException in generator

2018-04-16 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439208#comment-16439208 ] Omkar Reddy commented on NUTCH-2551: Hello [~wastl-nagel] I used Hadoop-2.7.4, I will try

[jira] [Commented] (NUTCH-2551) NullPointerException in generator

2018-04-11 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433705#comment-16433705 ] Omkar Reddy commented on NUTCH-2551: [~wastl-nagel], [~HansBrende], [~lewi...@apache.org] please let

[jira] [Commented] (NUTCH-2551) NullPointerException in generator

2018-04-10 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431997#comment-16431997 ] Omkar Reddy commented on NUTCH-2551: Hi, [~HansBrende] I tried reproducing the error in pseudo

[jira] [Commented] (NUTCH-2553) Fetcher not to modify URLs to be fetched

2018-04-10 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431801#comment-16431801 ] Omkar Reddy commented on NUTCH-2553: [~wastl-nagel] I did not add anything that produces this specific

[jira] [Commented] (NUTCH-2551) NullPointerException in generator

2018-04-10 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431798#comment-16431798 ] Omkar Reddy commented on NUTCH-2551: I think the issue here is that a new job is(job.getInstance

[jira] [Commented] (NUTCH-2518) Must check return value of job.waitForCompletion()

2018-03-27 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415426#comment-16415426 ] Omkar Reddy commented on NUTCH-2518: I might have missed this ticket. Hi [~wastl-nagel

[jira] [Commented] (NUTCH-2383) Wrong FS exception in Fetcher

2017-11-08 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16244019#comment-16244019 ] Omkar Reddy commented on NUTCH-2383: I recently faced this issue, we need to set the property

[jira] [Commented] (NUTCH-2442) Injector to stop if job fails to avoid loss of CrawlDb

2017-11-03 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238278#comment-16238278 ] Omkar Reddy commented on NUTCH-2442: [~wastl-nagel] I am working on this on my local branch of NUTCH

Re: Subscription request

2017-09-21 Thread Omkar Reddy
Hi Raffaele, Please send an email to dev-subscr...@nutch.apache.org for subscribing to the developer mailing list. Thanks, Omkar. On 20 September 2017 at 21:58, Raffaele Palmieri wrote: > Ask for subscription. >

[jira] [Updated] (NUTCH-2427) Remove all the Hadoop wildcard imports.

2017-09-20 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Reddy updated NUTCH-2427: --- Labels: easyfix (was: ) > Remove all the Hadoop wildcard impo

[jira] [Created] (NUTCH-2427) Remove all the Hadoop wildcard imports.

2017-09-20 Thread Omkar Reddy (JIRA)
Omkar Reddy created NUTCH-2427: -- Summary: Remove all the Hadoop wildcard imports. Key: NUTCH-2427 URL: https://issues.apache.org/jira/browse/NUTCH-2427 Project: Nutch Issue Type: Improvement

Re: Request for Review

2017-09-11 Thread Omkar Reddy
> Nice work Omkar, thumbs up from a fellow student. > > On Sep 10, 2017 10:37 AM, "Omkar Reddy" <omkarreddy2...@gmail.com> wrote: > >> >> Hi Sebastian, >> >> While squashing the pull request there was some mistake and the commits >> were del

Re: Request for Review

2017-09-10 Thread Omkar Reddy
> Hi, > > thanks, Omkar for your work! > > Just wanted to start testing, but looks like the pull request is lost. > > Thanks, > Sebastian > >> On 09/06/2017 10:57 PM, lewis john mcgibbney wrote: >> Hi user@ and dev@, >> >> As part of the Nutch

Regarding checksum error in hadoop in my latest PR.

2017-08-09 Thread Omkar Reddy
Hello dev@, I am facing an EOFException in the file TestGenerator.java and I cannot get my hands on the way in which I can solve it. The Exception is as follows : 1. 2017-08-09 12:57:06,026 WARN fs.FSInputChecker (ChecksumFileSystem.java:<init>(157)) - Problem opening checksum file:

Google Summer of Code Weekly Reports.

2017-07-12 Thread Omkar Reddy
Hello all, Please find my updated weekly reports here[0]. Please feel free to provide any suggestions. Thanks, Omkar. [0] https://wiki.apache.org/nutch/GoogleSummerOfCode/GraphGeneratorTool/WeeklyReports

GSoC 2017 Weekly Reports

2017-06-15 Thread Omkar Reddy
Hello dev@, I will be updating my weekly reports here [1] and will post the same here. Please feel free to provide any review or comments on the reports. Thanks, Omkar. [1] https://wiki.apache.org/nutch/GoogleSummerOfCode/GraphGeneratorTool/WeeklyReports

Re: GSoC 2017: You are a mentor for Omkar Reddy Gojala

2017-05-09 Thread Omkar Reddy
s year. > > Looking forward to this project. > > Best > > Lewis > > > > > > -- Forwarded message -- > > From: *Google Summer of Code* <summerofcode-nore...@google.com > <mailto:summerofcode-nore...@google.com>

[jira] [Commented] (NUTCH-2375) Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce

2017-04-27 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986881#comment-15986881 ] Omkar Reddy commented on NUTCH-2375: Hello dev@, I am using the following url : https

[jira] [Created] (NUTCH-2375) Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce

2017-04-20 Thread Omkar Reddy (JIRA)
Omkar Reddy created NUTCH-2375: -- Summary: Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce Key: NUTCH-2375 URL: https://issues.apache.org/jira/browse/NUTCH-2375 Project

[jira] [Created] (NUTCH-2372) Javadocs build failing.

2017-04-10 Thread Omkar Reddy (JIRA)
Omkar Reddy created NUTCH-2372: -- Summary: Javadocs build failing. Key: NUTCH-2372 URL: https://issues.apache.org/jira/browse/NUTCH-2372 Project: Nutch Issue Type: Bug Components

[Wiki Update] Added my GSoC proposal.

2017-03-31 Thread Omkar Reddy
Hello dev@, I have added my GSoC proposal to the wiki here [0]. Please add your feedback and comments if any. Thanks, ~Omkar [0]https://wiki.apache.org/nutch/GoogleSummerOfCode/GraphGeneratorTool

Re: Ambiguity in the usage of bin/nutch webgraph.

2017-03-25 Thread Omkar Reddy
> > University of Southern California, Los Angeles, CA 90089 USA > > WWW: http://irds.usc.edu/ > > ++ > > > > > > *From: *Omkar Reddy <omk...@apache.org> > *Reply-To: *"dev@nutch.apache.org" <dev@nutch.apache.org> > *Date: *Thursday, March 23, 201

Ambiguity in the usage of bin/nutch webgraph.

2017-03-23 Thread Omkar Reddy
Hi dev@, I have been trying to create a webgraph from my crawl data using the documentation here [0]. The command format in the documentation is as follows : >> bin/nutch webgraph (-segment | -segmentDir | -webgraphdb ) [-filter -normalize] | -help Upon usage of a command in the above

Re: GSOC2017: Anybody is mentoring and is interested in improving Solr integration

2017-03-22 Thread Omkar Reddy
Hi Alex, I have already raised an issue regarding this on Nutch JIRA here [0], I struggled integrating solr with nutch, Please drop a message here if you are having a bootcamp or a webinar session, I would like to attend it and learn more about solr. Thanks and Regards, Omkar. [0]

[jira] [Commented] (NUTCH-2369) Create a new GraphGenerator Tool for writing Nutch Records as a Full Web Graph

2017-03-16 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929431#comment-15929431 ] Omkar Reddy commented on NUTCH-2369: Branch 1.x [~lewismc]. Thanks. > Create a new GraphGenera

[jira] [Commented] (NUTCH-2366) Deprecated Job constructor in hostdb/ReadHostDb.java

2017-03-16 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929423#comment-15929423 ] Omkar Reddy commented on NUTCH-2366: Yes, this is my first patch [~lewismc] > Deprecated

[jira] [Commented] (NUTCH-2366) Deprecated Job constructor in hostdb/ReadHostDb.java

2017-03-15 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926746#comment-15926746 ] Omkar Reddy commented on NUTCH-2366: Thank you very much [~markus17] > Deprecated Job construc

[jira] [Commented] (NUTCH-2366) Deprecated Job constructor in hostdb/ReadHostDb.java

2017-03-11 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906265#comment-15906265 ] Omkar Reddy commented on NUTCH-2366: Hi [~markus17], Do I need to send a pull request to the git repo

[jira] [Updated] (NUTCH-2366) Deprecated Job constructor in hostdb/ReadHostDb.java

2017-03-10 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Reddy updated NUTCH-2366: --- Attachment: NUTCH-2366.patch > Deprecated Job constructor in hostdb/ReadHostDb.j

[jira] [Created] (NUTCH-2361) Deprecated nutch and solr integration documentation.

2017-02-21 Thread Omkar Reddy (JIRA)
Omkar Reddy created NUTCH-2361: -- Summary: Deprecated nutch and solr integration documentation. Key: NUTCH-2361 URL: https://issues.apache.org/jira/browse/NUTCH-2361 Project: Nutch Issue Type

Re: Unable to Integrate Solr with Nutch.

2017-02-05 Thread Omkar Reddy
eck here: https://camilotejeiro.wordpress.com/2015/08/27/ > nutch1-solr5-integration-searching-the-web/ > > Kind Regards, > Furkan KAMACI > > On Feb 3, 2017 9:22 AM, "Omkar Reddy" <omkarreddy2...@gmail.com> wrote: > >> Hi Cihad, >> >> To my surprise I am unable to

Re: Google Summer of Code 2017 is coming

2017-02-04 Thread Omkar Reddy
Hello Lewis, I am keen on participating in GSOC 2017. I have started exploring nutch recently and would be glad to contribute to it in the future. Have a great weekend. Thanks, Omkar. On 4 February 2017 at 01:29, lewis john mcgibbney wrote: > Hi Folks, > Please see above.

Re: Unable to Integrate Solr with Nutch.

2017-02-02 Thread Omkar Reddy
uence/display/solr/Schema+ >> Factory+Definition+in+SolrConfig >> >> Regards, >> Cihad Guzel >> >> 2017-02-02 20:51 GMT+03:00 Omkar Reddy <omk...@apache.org>: >> >>> Hello dev@, >>> >>> I am new to Nutch and I have been explori

[jira] [Commented] (NUTCH-2309) Scoring-Similarity Plugin raises NullPointerException when error occurs in fetching URL

2017-02-02 Thread Omkar Reddy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850238#comment-15850238 ] Omkar Reddy commented on NUTCH-2309: Hi [~jxihong], I tried to reproduce this error but I was unable

Fwd: Unable to Integrate Solr with Nutch.

2017-02-02 Thread Omkar Reddy
Hello dev@, I am new to Nutch and I have been exploring it in the past few days. I tried crawling a website to see things working but I couldn't integrate solr with nutch. I couldn't find schema.xml as mentioned here [0]. When I did some digging I found out that solr has removed schema.xml from