pmd-ext contains PMD (http://pmd.sourceforge.net/) libraries. I
committed them a long time ago in an attempt to bring some static
analysis tools to the Nutch sources. There was a short discussion around
it and we all thought it was worth doing but it never gained enough
momentum. There is a pmd
2009/1/20 Piotr Kosiorowski :
pmd-ext contains PMD (http://pmd.sourceforge.net/) libraries. I
committed them a long time ago in an attempt to bring some static
analysis tools to the Nutch sources. There was a short discussion around
it and we all thought it was worth doing but it never
Chris,
I have documented the process in the wiki. Doug has sent the links
already. If you have any questions I would be willing to help. I can
even do it myself if you find it difficult - I simply do not want to be
the bottleneck as I am behind my schedule at work and in private life.
I still hope
Otis,
Some time ago people on the list said that they are willing to at
least maintain Nutch 0.7 branch. As a committer (not very active
recently) I volunteered to commit patches when they appear - I do not
have enough time at the moment to do active coding. I have created a
0.7.3 release in JIRA
[
https://issues.apache.org/jira/browse/NUTCH-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Piotr Kosiorowski closed NUTCH-429.
---
Resolution: Invalid
Please use nutch-user mailing list for such questions and JIRA
As no objections were raised I created a 0.7.3 version in JIRA so we can
start assigning current JIRA issues to it.
Regards
Piotr
Piotr Kosiorowski wrote:
Hello committers,
Based on a recent discussion on nutch user list - (Strategic Direction
of Nutch) I would like to prepare 0.7.3 release
Hello committers,
Based on a recent discussion on nutch user list - (Strategic Direction
of Nutch) I would like to prepare 0.7.3 release. The idea is to allow
people who still use 0.7.2 to get rid of most important bugs and allow
them to add some small features they would need as the claim is
Please read the tutorial on the Nutch site. I suggest posting such issues
to nutch-user - you will have much higher chance of getting useful
response there.
regards
Piotr
On 11/9/06, kauu [EMAIL PROTECTED] wrote:
or it's the same with the version 0.8.x
any idea is appreciated
On 11/9/06, kauu [EMAIL
+1
On 10/16/06, Doug Cutting [EMAIL PROTECTED] wrote:
Sami Siren wrote:
looks like somebody just enabled email-to-jira-comments-feature. I was
just wondering would it be good to use this feature more widely.
I think it would be good. That way mailing list discussion would be
logged to the
I had a look at it and it seems I do not have enough permissions to
change it. So probably this one goes to Doug...
P.
Chris Mattmann wrote:
Hey Guys,
Speaking of which, I noticed that Sami's issue below is a Task in JIRA,
which reminded me of a task that I input a long time ago that would be
[ http://issues.apache.org/jira/browse/NUTCH-374?page=all ]
Piotr Kosiorowski reassigned NUTCH-374:
---
Assignee: Piotr Kosiorowski
when http.content.limit be set to -1 and Response.CONTENT_ENCODING is gzip
or x-gzip , it can not fetch any
No objections from me. We waited long and we can fix things in
a maintenance release in a few weeks.
Regards
Piotr
On 7/26/06, Sami Siren [EMAIL PROTECTED] wrote:
Andrzej Bialecki wrote:
Sami Siren wrote:
There is a package available for testing in
http://people.apache.org/~siren/nutch-0.8/
I think I would log in both situations but different message.
+1
P.
On 7/21/06, Stefan Groschupf [EMAIL PROTECTED] wrote:
Hi Developers,
another thing in the discussion to be more polite.
I suggest that we log a message in case a requested URL was blocked
by a robots.txt.
Optimal would be if
,
is there a reason why this (among other) documentation (for all relevant
versions)
could not be maintained in trunk?
--
Sami Siren
Piotr Kosiorowski wrote:
Andrzej Bialecki wrote:
+1, yes it would be really confusing. Since there are more and more
people trying 0.8, could we perhaps
+1.
P.
Andrzej Bialecki wrote:
Sami Siren wrote:
How would folks feel about releasing 0.8 now, there has been quite a
lot of improvements/new features
since 0.7 series and I strongly feel that we should push the first 0.8
series release (alpha/beta)
out the door now. It would IMO lower the
it so many times that I want to cross check).
Regards
Piotr
Dawid Weiss wrote:
What kind of problems? If you need something, let me know.
D.
Piotr Kosiorowski wrote:
I got some problems while applying Dawid's clustering patch (my linux
environment looks not to be set up correctly) - but I switched
I got some problems while applying Dawid's clustering patch (my linux
environment looks not to be set up correctly) - but I switched to cygwin
and it looks ok. I will try to commit it today/tomorrow.
Regards
Piotr
On 4/12/06, Chris Mattmann [EMAIL PROTECTED] wrote:
Hi Guys,
Any progress on the
Anton Potehin wrote:
Where is the mapred branch of nutch located now?
It is developed in trunk now.
P.
Jérôme Charron wrote:
2) We do have oro 2.0.7 in dependencies (I think urlfilter and similar
things). PMD requires oro 2.0.8. Do you think we can upgrade (as far
as I know 2.0.7 and 2.0.8 should be compatible)? We would have only one
oro jar then.
Piotr, please keep oro-2.0.8 in pmd-ext
I
I do agree with Jérôme - plugins should be checked too.
I would like to integrate PMD for core and plugins over the weekend based on
Dawid's work - I will make it a totally separate target (so tests do not
depend on it).
The goal is to allow other developers to play with PMD easily but at the
I will make it a totally separate target (so tests do not
depend on it).
That was actually Doug's idea (and I agree with it) to stop the build
file if PMD complains about something. It's similar to testing -- if
your tests fail, the entire build file fails.
I totally agree with it - but I
Doug Cutting wrote:
Piotr, would you like to make this release, or should I?
I would prefer you would do it this time - I am not sure if I can find
some time next week. I would like to do some things before release though:
1) Commit clustering patch from Dawid (I took it over from Andrzej).
Hello Christopher,
I personally do not like combining logging with severe error handling
but it is one of the features of Nutch for some time and I do not think
it causes infinite loops in normal installations. Changing it as we are
preparing to release a new version is not a good idea in my
think we can upgrade (as far
as I know 2.0.7 and 2.0.8 should be compatible)? We would have only one
oro jar then.
So happy PMD-ing,
Piotr
Doug Cutting wrote:
Piotr Kosiorowski wrote:
I will make it a totally separate target (so tests do not
depend on it).
That was actually Doug's idea (and I
+1 - I offer my help - we can coordinate it and I can do a part of work. I
will also try to commit your patches quickly.
Piotr
On 4/6/06, Dawid Weiss [EMAIL PROTECTED] wrote:
Other options (raised on the Hadoop list) are Checkstyle:
PMD seems to be the best choice for an Apache project and
=1465574group_id=56262
D.
[ http://issues.apache.org/jira/browse/NUTCH-239?page=all ]
Piotr Kosiorowski closed NUTCH-239:
---
Fix Version: 0.7.2-dev
Resolution: Fixed
Assign To: Piotr Kosiorowski
Applied with JavaDoc changes. Thanks.
I changed httpclient to use
[ http://issues.apache.org/jira/browse/NUTCH-94?page=all ]
Piotr Kosiorowski closed NUTCH-94:
--
Fix Version: 0.7.2-dev
Resolution: Duplicate
Assign To: Piotr Kosiorowski
Duplicate of NUTCH-117.
MapFile.Writer throwing 'File exists
[ http://issues.apache.org/jira/browse/NUTCH-14?page=all ]
Piotr Kosiorowski closed NUTCH-14:
--
Resolution: Cannot Reproduce
Closed according to Stefan's suggestion
NullPointerException NutchBean.getSummary
[ http://issues.apache.org/jira/browse/NUTCH-117?page=all ]
Piotr Kosiorowski closed NUTCH-117:
---
Fix Version: 0.7.2-dev
Resolution: Fixed
Assign To: Piotr Kosiorowski
Applied fix by Mike. Also reported offlist by Michal Karwanski
Hi,
I have updated site in 0.7 branch with latest trunk changes. I have
added both tutorials to the site so people will be aware of differences.
I have also committed DOAP file in 0.7 branch.
Nutch Website uses branch-0.7 now.
Piotr
Hello,
I would like to release nutch 0.7.2 in a week or two. Some serious
bugfixes are already covered and I have a plan to fix one or two more.
I found an email from Doug with title [Fwd: Crawler submits forms?]
stating: This has been fixed in the mapred branch, but that patch is
not in
[ http://issues.apache.org/jira/browse/NUTCH-225?page=all ]
Piotr Kosiorowski closed NUTCH-225:
---
Resolution: Won't Fix
I have just updated the Nutch Web site. It now contains both tutorials (for 0.7 and
0.8).
I have also added a note to each
Oops, sorry for ignoring this discussion - I was looking for comments in
JIRA and already committed the change before reading your discussion.
My motivation is to have usable version of tutorial - as simple as it is
possible to be versioned with the sources - only for historical purposes
- if
[ http://issues.apache.org/jira/browse/NUTCH-91?page=all ]
Piotr Kosiorowski closed NUTCH-91:
--
Fix Version: 0.7.2-dev
0.8-dev
Resolution: Fixed
Committed with a small extension. Thanks.
empty encoding causes exception
[
http://issues.apache.org/jira/browse/NUTCH-225?page=comments#action_12369405 ]
Piotr Kosiorowski commented on NUTCH-225:
-
As stated in another thread I prefer to have a simple tutorial kept in version
control with releases.
We already have
Hi,
It looks like Nutch web site was updated with site built from latest
trunk - the only problem is it contains tutorial for unreleased (yet)
version 0.8. I think we talked about it and agreed to keep tutorial for
latest release on the Web. I have just updated site in svn (branch-0.7)
with
Andrzej Bialecki wrote:
+1, yes it would be really confusing. Since there are more and more
people trying 0.8, could we perhaps include a short note that 0.8 and
later is NOT compatible with this tutorial, and a reference to the
tutorial for 0.8 (or the trunk/ branch in general)?
I can
[
http://issues.apache.org/jira/browse/NUTCH-79?page=comments#action_12364496 ]
Piotr Kosiorowski commented on NUTCH-79:
I think it should work without changes I suggested in previous comment - they
would be simply useful additions.
I was not using
[ http://issues.apache.org/jira/browse/NUTCH-45?page=all ]
Piotr Kosiorowski closed NUTCH-45:
--
Fix Version: 0.7.2-dev
Resolution: Fixed
Applied. Thanks.
Log corrupt segments in SegmentMergeTool
[ http://issues.apache.org/jira/browse/NUTCH-174?page=all ]
Piotr Kosiorowski closed NUTCH-174:
---
Fix Version: 0.7.2-dev
0.8-dev
Resolution: Fixed
Fixed some time ago during preparation of 0.7.2 release. Please use version
It fails on my machine on parse-ext tests. I am not sure what is causing
it yet and I am afraid I do not have time to investigate it today -
maybe in a few days. I did a small change to make it compile a few days
ago, but all tests went ok before I committed it.
Regards
Piotr
Stefan Groschupf
+1 in general
In fact I like the approach presented by Stefan to pass only required
parameters to objects that have small number of configurable params
instead of NutchConf - it makes it obvious which parameters are required
for such basic objects to run and as they are usually building blocks
Andrzej,
Do you think it would be a good idea to commit it in 0.7 branch for
0.7.2 release? I personally prefer to use released libraries instead of
RC if possible. It does not require a lot of changes and you have
already tested it with existing code...
Piotr
[EMAIL PROTECTED] wrote:
[ http://issues.apache.org/jira/browse/NUTCH-142?page=all ]
Piotr Kosiorowski closed NUTCH-142:
---
Fix Version: 0.7.2-dev
0.8-dev
Resolution: Fixed
NutchConf should use the thread context classloader
[
http://issues.apache.org/jira/browse/NUTCH-138?page=comments#action_12361520 ]
Piotr Kosiorowski commented on NUTCH-138:
-
I am not sure but I would suspect it is a problem of bad tomcat configuration.
To handle special characters in query urls
[ http://issues.apache.org/jira/browse/NUTCH-138?page=all ]
Piotr Kosiorowski closed NUTCH-138:
---
Resolution: Invalid
Setting URIEncoding in tomcat config file fixes the problem.
non-Latin-1 characters cannot be submitted for search
[
http://issues.apache.org/jira/browse/NUTCH-138?page=comments#action_12361549 ]
Piotr Kosiorowski commented on NUTCH-138:
-
BTW - just create a user for yourself in the nutch Wiki and you should be able to add
a new page with information without problems
Andrzej Bialecki wrote:
Hi,
I just committed a large patch to cleanup the trunk/ of obsolete and
broken classes remaining from the 0.7.x development line. Please test
that things still work as they should ...
Hi,
I am not sure what is wrong but a lot of JUnit tests simply do not
compile -
AJ Chen wrote:
It would be great if I can add some new functions to the nutch code to
accomplish this. But, if it requires to customize lucene code, that's
fine. I have tried to use the most recent release (1.4.3) of lucene
source code, but it did not work. Is the lucene jar files included
[
http://issues.apache.org/jira/browse/NUTCH-142?page=comments#action_12361492 ]
Piotr Kosiorowski commented on NUTCH-142:
-
Thanks. Fixed in 0.7 branch. Left open to fix it in trunk after cleaning trunk
JUnit test problems (in next few days
[ http://issues.apache.org/jira/browse/NUTCH-42?page=all ]
Piotr Kosiorowski closed NUTCH-42:
--
Fix Version: 0.7.2-dev
0.8-dev
Resolution: Fixed
OpenSearch implemented.
enhance search.jsp such that it can also returns XML
[
http://issues.apache.org/jira/browse/NUTCH-148?page=comments#action_12361206 ]
Piotr Kosiorowski commented on NUTCH-148:
-
'df' command is required for NDFS operation so if you were not using NDFS in
0.7.1 and nutch shell scripts you were able
[ http://issues.apache.org/jira/browse/NUTCH-148?page=all ]
Piotr Kosiorowski closed NUTCH-148:
---
Resolution: Invalid
org.apache.nutch.tools.CrawlTool throws error while doing deleteduplicates
[ http://issues.apache.org/jira/browse/NUTCH-147?page=all ]
Piotr Kosiorowski closed NUTCH-147:
---
Resolution: Invalid
cygwin requirement on Windows is listed in nutch tutorial. Please reopen if
problems persist after using it from cygwin
[
http://issues.apache.org/jira/browse/NUTCH-148?page=comments#action_12361128 ]
Piotr Kosiorowski commented on NUTCH-148:
-
Do you have Cygwin installed?
Is 'df' working in your cygwin installation?
Do you run crawl from cygwin shell?
Nutch
+1 - especially for amount of support Stefan gives to nutch users.
P.
Andrzej Bialecki wrote:
Hi,
During the past year and more Stefan participated actively in the
development, and contributed many high-quality patches. He's been
spending considerable effort on addressing many issues in JIRA,
Doug Cutting wrote:
[EMAIL PROTECTED] wrote:
+/*
+ * (non-Javadoc)
+ *
+ * @see org.apache.nutch.io.Writable#write(java.io.DataOutput)
+ */
+public final void write(DataOutput out) throws IOException {
We should either include javadoc or not. In general, all
Hi,
I have problems with JUnit tests in trunk and mapred branches.
TestFetcher fails in both branches. The same test executes correctly in
0.7 branch.
Is it only my problem (environment setup) or others are having it too?
I would suspect some changes in redirect handling
Regards
Piotr
Doug Cutting wrote:
Andrzej Bialecki wrote:
Please also don't forget that the trunk/ will soon be invaded by the
code from mapred, I guess some time around the middle of January (Doug?)
Thinking about this more, perhaps we should do it sooner. There's
already a branch for 0.7.x releases,
Hi,
I started to think about implementing special kind of Lucene Query (if I
remember correctly I would have to write my own Scorer and probably a few
other classes) optimized for Nutch some time ago. I assumed having
specialized query I would be able to avoid accessing some of lucene index
Jérôme Charron wrote:
[...]
build a list of file extensions to include (other ones will be excluded) in
the fetch process.
[...]
I would not like to exclude all others - as for example many extensions
are valid for html - especially dynamically generated pages (jsp,asp,cgi
just to name the easy
On 11/22/05, Andrzej Bialecki [EMAIL PROTECTED] wrote:
Hi,
I've been profiling a Nutch installation, and to my surprise the largest
amount of throwaway allocations and the most time spent was not in Nutch
specific code, or IPC, but in Lucene ConjunctionScorer.doNext() method.
This method
, Andrzej Bialecki [EMAIL PROTECTED] wrote:
Piotr Kosiorowski wrote:
On 11/22/05, Andrzej Bialecki [EMAIL PROTECTED] wrote:
Hi,
I've been profiling a Nutch installation, and to my surprise the largest
amount of throwaway allocations and the most time spent was not in Nutch
specific code
[ http://issues.apache.org/jira/browse/NUTCH-99?page=all ]
Piotr Kosiorowski closed NUTCH-99:
--
Resolution: Fixed
Patch committed. Thanks Stefan.
ports are hardcoded or random
-
Key: NUTCH-99
URL
EM wrote:
202443 Pages consumed: 13 (at index 13). Links fetched: 233386.
202443 Suspicious outlink count = 30442 for [http://www.dmoz.org/].
202444 Pages consumed: 135000 (at index 135000). Links fetched: 272315.
If there is maxoutlinks already specified in the xml config, why does
Committed in trunk and branch-0.7 (just in case if we decide to make a
0.7.2 release sometime).
Thanks
Piotr
On 10/11/05, Stefan Groschupf [EMAIL PROTECTED] wrote:
Hi,
don't think I'm fuddy-duddy but is it really sensible to do the following
in the nutchbean?
File [] directories =
Hello,
I have prepared Nutch 0.7.1 release today but I had one problem. I was
updating the site in branch but to deploy it one must use the version
from trunk. Currently I simply committed generated site in trunk but
this solution is far from perfect.
Should we have version independent site -
Have a look at http://issues.apache.org/jira/browse/NUTCH-48. I think an ngram
based approach is appropriate here. I was using it in our search engine.
Regards
Piotr
On 9/29/05, Jack Tang [EMAIL PROTECTED] wrote:
Hi
I really like Google's "Did you mean" and I notice that nutch now
does not
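The ngram based approach mentioned in the reply above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code attached to NUTCH-48: the class name NgramSuggestSketch, the "_" padding character and the Dice-style overlap score are all choices made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical demo class; it is not part of Nutch or of the NUTCH-48 patch.
final class NgramSuggestSketch {

    // Splits a word into character n-grams. The word is padded with '_' so
    // that its first and last letters also contribute distinctive grams.
    static Set<String> ngrams(String word, int n) {
        String padded = "_" + word.toLowerCase(Locale.ROOT) + "_";
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= padded.length(); i++) {
            grams.add(padded.substring(i, i + n));
        }
        return grams;
    }

    // Dice-style overlap: 2 * |common grams| / (|grams of a| + |grams of b|).
    static double similarity(String a, String b, int n) {
        Set<String> ga = ngrams(a, n);
        Set<String> gb = ngrams(b, n);
        if (ga.isEmpty() || gb.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(ga);
        common.retainAll(gb);
        return 2.0 * common.size() / (ga.size() + gb.size());
    }

    // Returns the dictionary term with the highest overlap, or null if no
    // term shares a single n-gram with the query.
    static String suggest(String query, List<String> dictionary, int n) {
        String best = null;
        double bestScore = 0.0;
        for (String candidate : dictionary) {
            double score = similarity(query, candidate, n);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("search", "crawl", "index");
        System.out.println(suggest("serach", terms, 2));
    }
}
```

In a real search engine the dictionary would presumably come from the index terms, and the candidate set would be narrowed by an n-gram lookup rather than by the linear scan used here.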
[ http://issues.apache.org/jira/browse/NUTCH-89?page=all ]
Piotr Kosiorowski closed NUTCH-89:
--
Fix Version: 0.8-dev
0.7
Resolution: Fixed
Applied in trunk and 0.7 branch. Thanks.
parse-rss null pointer exception
[
http://issues.apache.org/jira/browse/NUTCH-95?page=comments#action_12330113 ]
Piotr Kosiorowski commented on NUTCH-95:
I was renaming segments quite often so I would vote for reading the date from
the segment instead of using dir name
Hello,
As it looks like everything that was planned was committed to the 0.7 branch, I would
like to prepare a 0.7.1 release in the next few days. I will change the branch name
at the same time to comply with the agreed standard.
Any objections?
Regards
Piotr
Hi Andrzej,
Is anything related to clustering commits left? Or should we proceed
with 0.7.1 release?
Piotr
[EMAIL PROTECTED] wrote:
Author: ab
Date: Mon Sep 19 07:11:07 2005
New Revision: 290163
URL: http://svn.apache.org/viewcvs?rev=290163&view=rev
Log:
Update of the clustering plugin,
Hello Andrzej,
You can also try http://issues.apache.org/jira/browse/NUTCH-79
- I think it should also help here - it is a bit complicated as it
contains additional functionality but if you have any problems I am
willing to help. I am going to perform some tests of it again and maybe
commit it
bin/nutch updatedb db $s1
command updates WebDB with links you fetched in segment $s1.
Regards
Piotr
Daniele Menozzi wrote:
Hi all, I have questions regarding org.apache.nutch.tools.CrawlTool: I
have not really understood what the relationship is between
depth, segments, fetching..
Take for
Hello,
You cannot do it. These structures were not designed for it. But you can
copy all the data to another ArrayFile, skipping the entries you want to delete.
Regards
Piotr
On 9/6/05, Ben [EMAIL PROTECTED] wrote:
Hi
How can I delete an entry in the ArrayFile/MapFile if I know the id/key?
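The copy-and-skip workaround described in the reply can be sketched as follows. The sketch deliberately uses a java.util.SortedMap as a stand-in for a key-sorted file, so it does not show the real Nutch ArrayFile/MapFile reader and writer classes; the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical demo class; a SortedMap stands in for a key-sorted file.
final class CopySkippingSketch {

    // Reads the source in key order and appends every entry to a fresh
    // target, except the entries whose key is marked for deletion. The
    // source itself is never modified.
    static SortedMap<String, String> copySkipping(SortedMap<String, String> source,
                                                  Set<String> keysToDelete) {
        SortedMap<String, String> target = new TreeMap<>();
        for (Map.Entry<String, String> entry : source.entrySet()) {
            if (keysToDelete.contains(entry.getKey())) {
                continue; // "deleting" means not copying the entry
            }
            target.put(entry.getKey(), entry.getValue());
        }
        return target;
    }

    public static void main(String[] args) {
        SortedMap<String, String> source = new TreeMap<>();
        source.put("a", "1");
        source.put("b", "2");
        source.put("c", "3");
        System.out.println(copySkipping(source, Collections.singleton("b")));
    }
}
```

With real files the same loop would read from the old file and append to a new one, after which the new file replaces the old one.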
Doug Cutting wrote:
Glancing at other Apache projects in subversion, I see that httpd uses
branch names like 2.2.x and tag names like 2.2.4. That's a little
cryptic. I propose that we use branch names like branch-2.4 and tag
names like release-2.4.1. What do folks think?
+1
In fact I
Doug Cutting wrote:
Currently we have three versions of nutch: trunk, 0.7 and mapred. This
increases the chances for conflicts. I would thus like to merge the
mapred branch into trunk soon. The soonest I could actually start this
is next week. Are there any objections?
Doug
+1
P.
Great - I just thought that it would be better if you look at it -
instead of me digging into the code. I wanted to be on the safe side
with 0.7.1 release.
Regards
Piotr
Jérôme Charron wrote:
I am a bit lost but just a quick check - shouldn't it also be committed
in Release-0.7 branch?
No,
Hello,
I do not object to putting lucene-analyzers-1.9-rc1-dev.jar in
nutch core but I would like to give another option. I think it is
possible to create a plugin which contains and exports this library and
make other analysis plugins depend on it. I am not an expert in it but I
think
crawl-urlfilter.txt is bin/nutch crawl specific. If you want to use
each step separately - you are in fact doing Whole Web crawling
from tutorial - so you need to modify regex-urlfilter.txt instead.
Regards
Piotr
On 8/22/05, Michael Ji [EMAIL PROTECTED] wrote:
Hi,
When I use intranet
Hello Jérôme,
I found it and committed the fix. It was not using UTF-8 encoding sometimes.
But while looking at the code I feel a little bit worried about
LanguageIdentifier.identify(InputStream is) - as it reads bytes from
file in chunks and converts each chunk to string separately. If multibyte
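The worry about converting each byte chunk to a string separately can be illustrated with a small, self-contained sketch. It is not the LanguageIdentifier code; the class ChunkedDecodingSketch and its methods are hypothetical, and it uses StandardCharsets and UncheckedIOException, which are newer than the JDK 1.4/1.5 discussed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of Nutch.
final class ChunkedDecodingSketch {

    // Risky pattern: every byte chunk is decoded on its own, so a multibyte
    // character that straddles two chunks cannot be decoded correctly.
    static String decodePerChunk(InputStream in, int chunkSize) {
        try {
            StringBuilder out = new StringBuilder();
            byte[] buf = new byte[chunkSize];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.append(new String(buf, 0, n, StandardCharsets.UTF_8));
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Safer pattern: the reader decodes the stream as a whole and carries
    // incomplete byte sequences over from one read to the next.
    static String decodeWithReader(InputStream in, int chunkSize) {
        try {
            StringBuilder out = new StringBuilder();
            Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
            char[] buf = new char[chunkSize];
            int n;
            while ((n = reader.read(buf)) != -1) {
                out.append(buf, 0, n);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "a\u00e9"; // 'a' is 1 byte in UTF-8, U+00E9 is 2 bytes
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        // A chunk size of 2 cuts the two-byte character in half.
        System.out.println("per chunk intact: "
                + text.equals(decodePerChunk(new ByteArrayInputStream(data), 2)));
        System.out.println("reader intact:    "
                + text.equals(decodeWithReader(new ByteArrayInputStream(data), 2)));
    }
}
```

With the two-byte chunk size the per-chunk variant should lose the accented character while the reader-based variant keeps it, which is the kind of failure that only shows up when a character happens to fall on a chunk boundary.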
It works on my Linux box - with both JDK 1.4 and 1.5.
I will try to track it down.
Regards
Piotr
Jérôme Charron wrote:
I am using JDK 1.5 on
Windows - I can test it on 1.4,1.5 on linux tomorrow - maybe this is the
problem.
OK. Thanks
Jérôme
Hello,
I have updated my local copy today and JUnit tests started to fail.
expected:<el> but was:<sv>
junit.framework.ComparisonFailure: expected:<el> but was:<sv>
at
org.apache.nutch.analysis.lang.TestLanguageIdentifier.testIdentify(Unknown
Source)
at
Hello Nutch Committers,
Is anyone working on preparing the release?
If not I can spend some time on it in an hour or so.
Regards
Piotr
Hello,
I have a problem related to 0.7 release.
After making a tar I was trying to go through crawl tutorial.
- tar xvfz nutch-0.7.tar.gz
bin/nutch - is not executable (and nutch-daemon.sh too).
I thought it was my mistake - I started to do it on Windows so I moved
to linux, but the problem
So I will move the release till tomorrow as I am a bit sleepy now.
Regards
Piotr
Doug Cutting wrote:
Piotr Kosiorowski wrote:
After making a tar I was trying to go through crawl tutorial.
- tar xvfz nutch-0.7.tar.gz
bin/nutch - is not executable (and nutch-daemon.sh too).
It is strange
/* /
<include name="${final.name}/**" />
</tarfileset>
<tarfileset dir="${build.dir}" mode="755">
<include name="${final.name}/bin/*" />
</tarfileset>
</tar>
</target>
I will commit it tomorrow and test.
Regards
Piotr
Doug Cutting wrote:
Piotr Kosiorowski wrote:
After making a tar I
Hi,
Maybe it would be a better idea to go for 0.7 branch and schedule a new
0.7.1 release in short time?
It is difficult for me to judge if the patch I had not seen is good for
release. So I would say 0 from me (if you think it is good enough I will
not object).
Regards,
Piotr
Andrzej
Hello,
To change nutch standard html parsing the best place to start would
probably be the parse-html plugin.
Regards
Piotr
Fuad Efendi wrote:
1. This is part of ParseText:
Any Accessories Backup Devices Media Barebone Systems Camcorder
Accessories Camcorders Cases External Enclosures CD / DVD
Boost for the page may be calculated in a few different ways (and in a few
different places in nutch):
1) PageRank based score
- calculated by nutch analyze command based on WebDB
- during fetchlist generation scores from WebDB are stored in segment
- indexing phase uses score
Hello,
I think a lot of people will wait before moving to mapreduce
implementation for some time so we will have a 0.7 version to support.
I was a heavy CVS branch user in my previous job taking care of a
common library so I fully agree that such a branch would be needed for bug
fixing. I would
Hello Ben,
I personally would be interested mainly in search part of it if speed
increase would be significant. I am running my indices on linux/ AMD
Opterons - I hope CLucene will work well in this environment. I assume
CLucene is compatible with Java lucene index format as we do have some
Hello Doug,
I read your email ten times and still I am not sure
what the problem is.
Regards,
Piotr
Doug Cutting wrote:
[EMAIL PROTECTED] wrote:
- <value>http://www.nutch.org/docs/en/bot.html</value>
+ <value>http://lucene.apache.org/nutch/bot.html</value>
I think this should now be:
No problem at all. I have a lot to learn yet and it is nice that
people like you check my commits for stupid mistakes. Four eyes
are always better than two :).
Regards,
Piotr
Doug Cutting wrote:
Piotr Kosiorowski wrote:
I read your email ten times and still I am not sure
what the problem
Will do it tomorrow - I wanted to put down a kind of release checklist
in the Wiki - starting with where to change numbers. I would like to cover
a release howto as well - but in fact I am not sure how to make a release
yet. But will try to gather this information.
Regards
Piotr
Andrzej Bialecki
Hello,
Some time ago someone mentioned on the list a problem with nutch
tutorial (I cannot find this email now). I have checked it today and
he/she was right. If you follow the nutch Intranet Crawling tutorial
you will end up with not very interesting index.
This is because it recommends users to
Hello,
I just created an issue in JIRA
http://issues.apache.org/jira/browse/NUTCH-79 containing the code for
fault tolerant searching. I think it is too late to include it in 0.7
release but I would wait for comments and test it in the meantime.
I would like to commit it when release would be
Thanks. It works.
Piotr
Doug Cutting wrote:
Piotr Kosiorowski wrote:
Looking around in JIRA I found out I cannot resolve an issue. I am
not sure how it works but I suspect I lack some rights to do so. Am I
right?
I have added you to the nutch-developers Jira group. Now you should