Stephane,
Nutch uses Lucene for indexing, and Lucene has a class called IndexWriter that
is used for indexing Lucene Documents. Here is a quick grep in Nutch's *java
files:
$ ffjg -l IndexWriter
./src/test/org/apache/nutch/indexer/TestDeleteDuplicates.java
./src/java/org/apache/nutch/indexer/I
Hi,
I've been meaning to write this message for a while, and Andrzej's
StrategicGoals made me compose it, finally.
Nutch 0.8 and beyond is very cool, very powerful, and once Hadoop stabilizes,
it will be even more valuable than it is today. However, I think there is
still a need for something
Those using IntelliJ or Eclipse may want to grab code styles for Lucene (and
Solr, Nutch, and Hadoop) that Grant and I put in
https://issues.apache.org/jira/browse/SOLR-245 . I hope they are helpful. The
plan is to stick them on the Wiki (and link from HowToContribute pages?).
Otis
. . .
Hi,
I was about to go assign some JIRA issues to myself and get the commits going
when I noticed that I'm not in Nutch JIRA yet. Could somebody please add me
there?
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Hi,
Yes, Nutch has the ability to build N indices and query those N indices,
merging the results.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Mohammad Monirul Hoque <[EMAIL PROTECTED]>
To: nutch-dev@lucene.apache.org
Sent: Sunday, June 2
Hi Dave,
It's really mostly about closing out some of the open bugs and going through
the release process. My guess is we'll have 1.0 this Fall.
Otis
- Original Message
> From: David Grandinetti <[EMAIL PROTECTED]>
> To: nutch-dev@lucene.apache.org
> Sent: Monday, June 23, 2008 5:16
This sounds simple and apparently it's effective...should anyone want to give
it a try:
http://glinden.blogspot.com/2008/08/clever-method-of-near-duplicate.html
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Hi,
Just found this email in my Nutch folder and as I was reading it I was
thinking "Got to ask Dennis if he/they will do the Nutch-Droids integration"
when I saw Dennis' name below. So, Dennis, is Droids on the roadmap for you?
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - S
Hello,
Quick heads up - I'm about to regenerate the files (HTML + PDF) for the site
and update it tomorrow according to the instructions on
http://wiki.apache.org/nutch/Website_Update_HOWTO . I have Forrest 0.8, and
the site files were last generated with Forrest 0.7, so there will be some
ch
emap=false
# *.failonerror=(true|false) - stop when an XML file is invalid
#forrest.validate.failonerror=true
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message ----
> From: Otis Gospodnetic
> To: Nutch Developer List
> Sent: Monday, January
zed
/home/otis/apache-forrest/main/webapp/resources/schema/relaxng/sitemap-v06.rng:2107:29:
error: datatype library "http://www.w3.org/2001/XMLSchema-datatypes" not
recognized
BUILD FAILED
/home/otis/apache-forrest/main/targets/validate.xml:158: Validation failed,
messages should hav
Site update
>
> http://www.mail-archive.com/d...@forrest.apache.org/msg15136.html
>
> This might help.
>
> Dennis
>
> Andrzej Bialecki wrote:
> > Otis Gospodnetic wrote:
> >> Below is what it spits out. I'm not sure what the cause is. I did
lations/.svn/foo: Permission denied
[o...@minotaur /www/lucene.apache.org/nutch]$ chmod g+w skin/translations/.svn
chmod: skin/translations/.svn: Operation not permitted
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message ----
> From: Otis Gospodnetic
>
Lucene doesn't use anything.
Hadoop uses PMD integrated in Hudson.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Doğacan Güney
> To: nutch-dev@lucene.apache.org
> Sent: Tuesday, January 20, 2009 10:49:44 AM
> Subject: Re: [jira] Created: (
oğacan Güney
> To: nutch-dev@lucene.apache.org
> Sent: Tuesday, January 20, 2009 1:13:20 PM
> Subject: Re: [jira] Created: (NUTCH-680) Update external jars to latest
> versions
>
> On Tue, Jan 20, 2009 at 7:48 PM, Otis Gospodnetic
> wrote:
> > Lucene doesn't use
---
> From: Doğacan Güney
> To: nutch-dev@lucene.apache.org
> Sent: Tuesday, January 20, 2009 3:40:20 PM
> Subject: Re: [jira] Created: (NUTCH-680) Update external jars to latest
> versions
>
> On Tue, Jan 20, 2009 at 10:35 PM, Otis Gospodnetic
> wrote:
> > That I do
I believe Lucene has (in contrib/analyzers) a class called WordLoader or
something like that. Perhaps you can use that to load stopwords from a file
(like Solr does) and submit that as a patch?
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
If you use the Nutch->Solr functionality, you can rely on Solr's MoreLikeThis
and Solr's SpellCheckComponent (both are described on Solr's wiki).
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: dealmaker
> To: nutch-dev@lucene.apache.org
> Subject: Re: Is there the functions of "More Like This" and "Spell Checking"?
>
>
> I am not using solr. I am using nutch to search for related urls to a url
> that user type. Can I still use solr's morelikethis in this case?
>
>
> Otis Gospodnetic-
Absolutely! I see you are at home with JIRA, so I don't have to ask. :)
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Frank McCown
> To: nutch-dev@lucene.apache.org
> Sent: Tuesday, March 3, 2009 9:39:24 AM
> Subject: site: operator wi
Hi,
This has been bugging me for a while now. For some reason Nutch MLs get the
most "junk" emails - both rude/rudeish emails, as well as clear spam (with
"SPAM" in the subject - something must be detecting it).
I just looked at the headers of the clearly labeled spam messages and found
th
I absolutely agree. Duplicating the work and focusing on non-core when the
same functionality can be gotten by using Tika is not wise for Nutch.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Andrzej Bialecki
> To: nutch-dev@lucene.apa
ny email. Please check the message
> headers
> to see how this message is routed to you. If it is indeed routed through
> Apache
> servers then please send the headers to me.
>
> Doug
>
> Andrzej Bialecki wrote:
> > Otis Gospodnetic wrote:
> >> Hi,
> >
Hello,
(I saw the first copy of this email went to nutch-user, but I assume nutch-dev
was a resend and the right list to follow up on)
I agree with the list of core competencies. For example, and I don't know
where I said/wrote this, but I know I said it a few times before -- I think
Solr is
Hi Kirby,
Do you think you could add this to Nutch's JIRA?
Please see http://wiki.apache.org/nutch/HowToContribute
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Kirby Bohling
> To: nutch-dev@lucene.apache.org
> Sent: Thursday, May 28,
Hello,
Has anyone seen this:
http://www.supermind.org/blog/580/java-net-url-synchronization-bottleneck ?
Is this something that needs to be addressed in Nutch (and thus in Bixo, and
thus in the common crawler project)?
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
Personally, I don't see the advantage of Nutch going for a TLP. It's not like
new committers are having a hard time getting in today, it's not like they are
being proposed and rejected. I also don't feel like Nutch lacks
exposure/visibility -- lots of people know about it. It's just that very
Build: plugins' Jars not found
--
Key: NUTCH-347
URL: http://issues.apache.org/jira/browse/NUTCH-347
Project: Nutch
Issue Type: Bug
Affects Versions: 0.8
Reporter: Otis Gospodnetic
[
http://issues.apache.org/jira/browse/NUTCH-233?page=comments#action_12427677 ]
Otis Gospodnetic commented on NUTCH-233:
I haven't noticed this regexp being a problem so far either, but maybe I've
just been lucky not to hav
[
http://issues.apache.org/jira/browse/NUTCH-359?page=comments#action_12433315 ]
Otis Gospodnetic commented on NUTCH-359:
Looks fine and simple (and has a small typo in the last comment). Sami is
doing 0.8.1 soon, so I won't mess
[
http://issues.apache.org/jira/browse/NUTCH-377?page=comments#action_12439016 ]
Otis Gospodnetic commented on NUTCH-377:
You'd need to modify ./src/java/org/apache/nutch/analysis/NutchAnalysis.jj and
regenerate the .java files
[
http://issues.apache.org/jira/browse/NUTCH-387?page=comments#action_12443742 ]
Otis Gospodnetic commented on NUTCH-387:
This indeed looks wrong.
My guess is that the new URL() line just needs to be removed, but I'm not
sur
[
http://issues.apache.org/jira/browse/NUTCH-389?page=comments#action_12444510 ]
Otis Gospodnetic commented on NUTCH-389:
Enis:
Can you give us some examples of how URLs were tokenized before, and how they
are tokenized with your patch
[
http://issues.apache.org/jira/browse/NUTCH-61?page=comments#action_12444514 ]
Otis Gospodnetic commented on NUTCH-61:
---
Has anyone been using the code with this patch applied? Just wondering if/how
well it works.
> Adaptive re-fe
[
https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472078
]
Otis Gospodnetic commented on NUTCH-444:
The ASF FeedParser you are talking about has, I believe, continued
[
https://issues.apache.org/jira/browse/NUTCH-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12474663
]
Otis Gospodnetic commented on NUTCH-447:
The idea being to limit crawling only to links under a certain
[
https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543427
]
Otis Gospodnetic commented on NUTCH-442:
Doğacan - your comments sound good and I'd guess "bean"
[
https://issues.apache.org/jira/browse/NUTCH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547642
]
Otis Gospodnetic commented on NUTCH-585:
A more general solution is needed. This solution should not rely on
[
https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547672
]
Otis Gospodnetic commented on NUTCH-442:
Doğacan -- can you please explain what you mean by "blog up
[
https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547741
]
Otis Gospodnetic commented on NUTCH-442:
Doğacan - ah, good!
The Nutch side of the functionality included in
[
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577669#action_12577669
]
Otis Gospodnetic commented on NUTCH-296:
Steve:
I was going to say "Gre
Minimize host address lookup
Key: NUTCH-627
URL: https://issues.apache.org/jira/browse/NUTCH-627
Project: Nutch
Issue Type: Improvement
Components: generator
Reporter: Otis Gospodnetic
[
https://issues.apache.org/jira/browse/NUTCH-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-627:
---
Attachment: NUTCH-627.patch
> Minimize host address loo
[
https://issues.apache.org/jira/browse/NUTCH-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12587786#action_12587786
]
Otis Gospodnetic commented on NUTCH-570:
Ned - are you still using this? S
: fetcher, generator
Reporter: Otis Gospodnetic
Nutch would benefit from having a DB with per-host/domain/TLD information. For
instance, Nutch could detect hosts that are timing out, store information about
that in this DB. Segment/fetchlist Generator could then skip such hosts, so
Reporter: Otis Gospodnetic
Fetch jobs will finish faster if we find a way to prevent servers that are
either slow or time out from slowing down the whole process.
I'll attach a patch that counts per-server exceptions and timeouts and tracks
download speed per server.
Q
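The idea in the issue description above (count exceptions and timeouts per server, track download speed per server) could be sketched roughly as follows. This is only an illustration, not the code from the attached patch: the class name HostFetchStats, its methods, and the maxFailures threshold are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-host fetch statistics; NOT the code from the attached patch. */
final class HostFetchStats {

  /** Counters kept for a single host. */
  private static final class Stats {
    long failures; // exceptions and timeouts seen for this host
    long bytes;    // total bytes downloaded from this host
    long millis;   // total time spent downloading from this host
  }

  private final Map<String, Stats> byHost = new HashMap<>();
  private final int maxFailures; // assumed threshold, chosen by the caller

  HostFetchStats(int maxFailures) {
    this.maxFailures = maxFailures;
  }

  /** Records one exception or timeout for the given host. */
  synchronized void recordFailure(String host) {
    byHost.computeIfAbsent(host, h -> new Stats()).failures++;
  }

  /** Records one successful download and how long it took. */
  synchronized void recordSuccess(String host, long bytes, long millis) {
    Stats s = byHost.computeIfAbsent(host, h -> new Stats());
    s.bytes += bytes;
    s.millis += millis;
  }

  /** Average download speed in bytes per second, or 0 if nothing was timed. */
  synchronized double bytesPerSecond(String host) {
    Stats s = byHost.get(host);
    return (s == null || s.millis == 0) ? 0.0 : s.bytes * 1000.0 / s.millis;
  }

  /** True once a host has reached the failure threshold. */
  synchronized boolean shouldSkip(String host) {
    Stats s = byHost.get(host);
    return s != null && s.failures >= maxFailures;
  }
}
```

A fetcher thread would call recordFailure or recordSuccess after each request and consult shouldSkip before taking the next URL for that host. The methods are synchronized on the assumption that several fetcher threads share one instance.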
[
https://issues.apache.org/jira/browse/NUTCH-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-629:
---
Attachment: NUTCH-629.patch
> Detect slow and timeout servers and drop their U
[
https://issues.apache.org/jira/browse/NUTCH-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588746#action_12588746
]
Otis Gospodnetic commented on NUTCH-629:
While the patch improves fetch speed
[
https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588772#action_12588772
]
Otis Gospodnetic commented on NUTCH-442:
This issue has a lot of votes and a lo
[
https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-628:
---
Attachment: NUTCH-628-DomainStatistics.patch
Enis' DomainStatistics tool from NUTCH-439.
[
https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-628:
---
Attachment: NUTCH-628-HostDb.patch
HostDatum.java
- really just holds a MapWritable
Project: Nutch
> Issue Type: New Feature
> Components: fetcher, generator
> Reporter: Otis Gospodnetic
> Attachments: NUTCH-628-DomainStatistics.patch, NUTCH-628-HostDb.patch
>
>
> Nutch would benefit from having a DB with per-host/domai
Key: NUTCH-628
> URL: https://issues.apache.org/jira/browse/NUTCH-628
> Project: Nutch
> Issue Type: New Feature
> Components: fetcher, generator
>Reporter: Otis Gospodnetic
> Attachments: NUTCH-628-DomainStatistics.pat
> Issue Type: New Feature
> Components: fetcher, generator
>Reporter: Otis Gospodnetic
> Attachments: NUTCH-628-DomainStatistics.patch, NUTCH-628-HostDb.patch
>
>
> Nutch would benefit from having a DB with per-host/domain/TLD information.
> For insta
[
https://issues.apache.org/jira/browse/NUTCH-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590486#action_12590486
]
Otis Gospodnetic commented on NUTCH-596:
This looks beautifully simple to me
[
https://issues.apache.org/jira/browse/NUTCH-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-570:
---
Assignee: Otis Gospodnetic
Another nudge for feedback from Ned or anyone else who tried this
[
https://issues.apache.org/jira/browse/NUTCH-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic reassigned NUTCH-627:
--
Assignee: Otis Gospodnetic
> Minimize host address loo
[
https://issues.apache.org/jira/browse/NUTCH-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic reassigned NUTCH-629:
--
Assignee: Otis Gospodnetic
> Detect slow and timeout servers and drop their U
[
https://issues.apache.org/jira/browse/NUTCH-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-626:
---
Fix Version/s: 1.0.0
> fetcher2 breaks out the domain with db.ignore.external.links set
[
https://issues.apache.org/jira/browse/NUTCH-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641047#action_12641047
]
Otis Gospodnetic commented on NUTCH-655:
I think we need a generic way for kee
[
https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641050#action_12641050
]
Otis Gospodnetic commented on NUTCH-650:
This sounds great, Doğacan! Simplifica
[
https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641059#action_12641059
]
Otis Gospodnetic commented on NUTCH-628:
After seeing NUTCH-650 I have a fee
[
https://issues.apache.org/jira/browse/NUTCH-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641047#action_12641047
]
otis edited comment on NUTCH-655 at 10/20/08 9:29 AM:
--
I th
[
https://issues.apache.org/jira/browse/NUTCH-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic resolved NUTCH-660.
Resolution: Invalid
I see you already asked on the list. That's the right place t
[
https://issues.apache.org/jira/browse/NUTCH-659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic resolved NUTCH-659.
Resolution: Invalid
Please ask questions on the mailing list.
> Help! No urls fetched
[
https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-669:
---
Priority: Major (was: Minor)
Fix Version/s: 1.0.0
+1 -- people, vote for it. This
[
https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-563:
---
Fix Version/s: (was: 0.9.0)
1.0.0
> Include custom fields
[
https://issues.apache.org/jira/browse/NUTCH-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658610#action_12658610
]
Otis Gospodnetic commented on NUTCH-675:
Sha Feng, could you please bring thi
[
https://issues.apache.org/jira/browse/NUTCH-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic resolved NUTCH-675.
Resolution: Won't Fix
According to Dennis Kubes's response on the mailing list
[
https://issues.apache.org/jira/browse/NUTCH-171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659610#action_12659610
]
Otis Gospodnetic commented on NUTCH-171:
But does generate.update.crawldb=
[
https://issues.apache.org/jira/browse/NUTCH-171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659639#action_12659639
]
Otis Gospodnetic commented on NUTCH-171:
Hm, yes, it's nice to b
[
https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659644#action_12659644
]
Otis Gospodnetic commented on NUTCH-669:
I, too, am very anxious to see how the
[
https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660397#action_12660397
]
Otis Gospodnetic commented on NUTCH-669:
Todd, and when you say "sustaine
[
https://issues.apache.org/jira/browse/NUTCH-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic resolved NUTCH-627.
Resolution: Fixed
Thanks Otis.
Sending        CHANGES.txt
Sending        src/java/org
[
https://issues.apache.org/jira/browse/NUTCH-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665482#action_12665482
]
Otis Gospodnetic commented on NUTCH-679:
I'm not sure, but committing this
[
https://issues.apache.org/jira/browse/NUTCH-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666283#action_12666283
]
Otis Gospodnetic commented on NUTCH-655:
1.1 sounds good to me.
> Injectin
[
https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666290#action_12666290
]
Otis Gospodnetic commented on NUTCH-628:
I'm +1 on getting Domain Stats
[
https://issues.apache.org/jira/browse/NUTCH-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666763#action_12666763
]
Otis Gospodnetic commented on NUTCH-666:
Dennis, could you please describe how
[
https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666764#action_12666764
]
Otis Gospodnetic commented on NUTCH-628:
Could you take it if you have time, pl
[
https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668135#action_12668135
]
Otis Gospodnetic commented on NUTCH-628:
Thanks for the update. Sorry, I d
[
https://issues.apache.org/jira/browse/NUTCH-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-707:
---
Fix Version/s: (was: 0.9.0)
1.1
> Generation of multiple segments
[
https://issues.apache.org/jira/browse/NUTCH-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic resolved NUTCH-736.
Resolution: Invalid
Assignee: Otis Gospodnetic
Please ask questions on nutch-user
[
https://issues.apache.org/jira/browse/NUTCH-731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712489#action_12712489
]
Otis Gospodnetic commented on NUTCH-731:
People have redirects on their robots
[
https://issues.apache.org/jira/browse/NUTCH-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712492#action_12712492
]
Otis Gospodnetic commented on NUTCH-721:
Questions:
Has anyone tried profiling
[
https://issues.apache.org/jira/browse/NUTCH-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712494#action_12712494
]
Otis Gospodnetic commented on NUTCH-721:
Ken's thoughts:
h
[
https://issues.apache.org/jira/browse/NUTCH-693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic reassigned NUTCH-693:
--
Assignee: Otis Gospodnetic
> Add configurable option for treating nofollow behavi
[
https://issues.apache.org/jira/browse/NUTCH-693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713862#action_12713862
]
Otis Gospodnetic commented on NUTCH-693:
I think I see some formatting that
[
https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713867#action_12713867
]
Otis Gospodnetic commented on NUTCH-650:
Doğacan, I think http://github.com/dog
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714086#action_12714086
]
Otis Gospodnetic commented on NUTCH-739:
I think there are a few issues
[
https://issues.apache.org/jira/browse/NUTCH-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714091#action_12714091
]
Otis Gospodnetic commented on NUTCH-677:
Marcin - could you please include
[
https://issues.apache.org/jira/browse/NUTCH-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-740:
---
Priority: Minor (was: Major)
Affects Version/s: (was: 0.9.0)
Fix
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714286#action_12714286
]
Otis Gospodnetic commented on NUTCH-739:
Yes, external optimize calls will wor
[
https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714536#action_12714536
]
Otis Gospodnetic commented on NUTCH-739:
Yeah, sounds right. That Tool should
[
https://issues.apache.org/jira/browse/NUTCH-101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic resolved NUTCH-101.
Resolution: Fixed
Thank you Ken.
> RobotRulesPar
[
https://issues.apache.org/jira/browse/NUTCH-731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-731:
---
Fix Version/s: 1.1
Assignee: Otis Gospodnetic
> Redirection of robots.txt
[
https://issues.apache.org/jira/browse/NUTCH-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic resolved NUTCH-742.
Resolution: Incomplete
Could you please post more detailed information to nutch-user
[
https://issues.apache.org/jira/browse/NUTCH-746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-746:
---
Patch Info: [Patch Available]
Fix Version/s: 1.1
> NutchBeanConstructor does not cl
[
https://issues.apache.org/jira/browse/NUTCH-738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-738:
---
Affects Version/s: (was: 1.1)
1.0.0
Fix Version/s: 1.1
[
https://issues.apache.org/jira/browse/NUTCH-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Otis Gospodnetic updated NUTCH-740:
---
Assignee: (was: Otis Gospodnetic)
> Configuration option to override default language
[
https://issues.apache.org/jira/browse/NUTCH-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851461#action_12851461
]
Otis Gospodnetic commented on NUTCH-570:
Serykh, what does your version of