Hi,
I would love to build a Nutch Query object via the API instead of using the
query parser.
In my case I need the complete set of boolean operators in the query:
required (AND), non-required (OR), and prohibited (NOT) terms.
I noticed that in general it would be possible to add a clause in
[
http://issues.apache.org/jira/browse/NUTCH-169?page=comments#action_12362447 ]
Stefan Groschupf commented on NUTCH-169:
I wonder what is the performance impact of this patch - in many places, where
previously we used the static methods on classes
remove static NutchConf
---
Key: NUTCH-169
URL: http://issues.apache.org/jira/browse/NUTCH-169
Project: Nutch
Type: Improvement
Reporter: Stefan Groschupf
Priority: Critical
Fix For: 0.8-dev
Removing the static NutchConf.get
[ http://issues.apache.org/jira/browse/NUTCH-169?page=all ]
Stefan Groschupf updated NUTCH-169:
---
Attachment: nutchConf.patch
The patch was created by Marko Bauhardt with some help from me, so full
credits to Marko!
It removes any access to NutchConf
[
http://issues.apache.org/jira/browse/NUTCH-169?page=comments#action_12362334 ]
Stefan Groschupf commented on NUTCH-169:
I forgot to mention that this is a first version, just for discussion and to give
Jerome the changed API; it is not the final
Hi Jerome,
I'm not sure, but could it be that with your new html protocol
plugin the ParserFactory fails because a component requires log4j?
Maybe we should then add log4j to the core classpath, since after I
added log4j to NUTCH_HOME/lib the test ran
successfully.
Sure, my mistake.
On 10.01.2006 at 18:24, Jérôme Charron wrote:
Hi Stefan,
No, in fact I have refactored the code of the protocol-http plugins,
not the html parser.
So I don't think the log4j error comes from this code.
Regards
Jérôme
--
http://motrech.free.fr/
http://www.frutch.org/
[
http://issues.apache.org/jira/browse/NUTCH-169?page=comments#action_12362393 ]
Stefan Groschupf commented on NUTCH-169:
Great! Thanks a lot Jerome!!! We will continue to fix some smaller bugs we
introduced and JobConf related issue and hopefully
Hi Doug,
in Nutch 0.8 the index is no longer in the segment folder.
What was the reason for that? In the context of a web GUI it might be
better to have the index in the segment folder as well, since the
segment folder would then be the single item whose life cycle has to be managed.
Thanks for a
Hi,
is anyone able to run the test suite without any problems?
Stefan
---
company:http://www.media-style.com
forum:http://www.text-mining.org
blog:http://www.find23.net
On 07.01.2006 at 00:43, Andrew McNabb wrote:
I'm looking at the Reporter interface, and I would like to verify my
understanding of what it is. It appears to me that
Reporter.setStatus()
is called periodically during an operation to give a human-readable
description of how far the progress
secure jobtracker info pages with a password
Key: NUTCH-166
URL: http://issues.apache.org/jira/browse/NUTCH-166
Project: Nutch
Type: Improvement
Versions: 0.8-dev
Reporter: Stefan Groschupf
Fix For: 0.8
[ http://issues.apache.org/jira/browse/NUTCH-166?page=all ]
Stefan Groschupf updated NUTCH-166:
---
Attachment: passwordPatch.txt
secure jobtracker info pages with a password
Key: NUTCH-166
What bug was that? What is your one-line fix?
http://www.nabble.com/RCP-known-limitation-or-bug--t688207.html
something like:
// getReturnType() returns Void.TYPE (never null) for void methods
Object[] values = method.getReturnType() != Void.TYPE
    ? (Object[]) Array.newInstance(method.getReturnType(), wrappedValues.length)
    : new Object[0];
I have two more ideas:
1) create NutchConf as interface (not class)
2) make it work as plugin
I like the idea to make the conf as a singleton and understand the
need to be able to integrate nutch.
However I would love to do one first step and later on we can make
this second step. I made
(2) What I'd REALLY like to see is if NutchConf were an interface,
As mentioned, give us some time to get the first step done, and then
I'm sure such community contributions will always be welcome.
Maybe people can work together on this.
Stefan
I like the idea, and it is another step in the direction of vertical
search, where I personally see the biggest chance for Nutch.
How to implement it? Surprisingly, I think that it's very simple -
just adding a CrawlDatum.policyId field would suffice, assuming we
have a means to store and
Hi,
to move toward having a Nutch GUI, I would love
to start removing the static access to NutchConf.
Based on experience, I would first love to get a kind of general
agreement and a 'go' before wasting too much time on an unaccepted
solution.
I suggest:
+ removing
I don't fully agree with this. In most such cases, you already have
a NutchConf instance in the method or class context, so it makes
sense to use it in the constructor. You could add these constructors
with all parameters iterated, but I'd expect that the constructors
using NutchConf
Hi,
I also agree and would love to see things changed.
In general I would love to be able to write log files to
custom storage types as well.
For example, it would be great if it were possible to write
log files into NDFS or into a database.
Especially for smaller scaled
Different parameters are sent to each address. So params.length
should equal addresses.length, and if params.length==0 then
addresses.length==0 and there's no call to be made. Make sense?
It might be clearer if the test were changed to addresses.length==0.
Yes, this would be better,
[ http://issues.apache.org/jira/browse/NUTCH-154?page=all ]
Stefan Groschupf closed NUTCH-154:
--
Resolution: Won't Fix
Please ask questions on the mailing lists; this is a bug tracking tool.
Unable to add/update new files to fetchlist/fetcher
[ http://issues.apache.org/jira/browse/NUTCH-55?page=all ]
Stefan Groschupf closed NUTCH-55:
-
Resolution: Duplicate
Duplicate of NUTCH-59
Create dmoz.org search plugin - incorporate the dmoz.org
title/category/description if available
I'm sending this to you because you are active on the nutch-users
list and I am too lazy to subscribe at this particular moment. Please
pass on / act as you see fit. Wiki itself seems immutable at least
to the likes of me.
-Jeff
= currently
By default the [WWW] file plugin is
Hi,
Can you provide a detailed stacktrace from the log file?
Stefan
On 25.12.2005 at 23:38, AJ Chen wrote:
I have seen repeatedly the following severe errors during fetching
400,000 pages with 200 threads. What may cause Host connection
pool not found? This type of error must be avoided,
It's time to do some cleanup of the trunk/ after the mapred merge.
+1
Hi,
Since we know that our httpclient plugin has some problems, maybe it
makes sense to update to the new library.
I guess this is some work, but maybe someone is interested in taking the
job. :)
http://www.theserverside.com/news/thread.tss?thread_id=38189
HttpClient 3.0 provides the following
Lukas,
the input folders are normally set by the tools, so you cannot
change that.
However, in case you use a Unix box, check that the user that runs
Nutch has read and write access to all the folders defined in
nutch-site/default.xml.
(I guess that can be the problem, Nutch uses e.g.
nutch-0.8-dev which I
get from nutch-trunk.
Regards,
Lukas
On 12/21/05, Stefan Groschupf [EMAIL PROTECTED] wrote:
Lukas,
the input folders are normally set by the tools, so you cannot
change that.
However, in case you use a Unix box, check that the user that runs
Nutch has read and write access
Hi Andrzej,
wow, that is really great news!
Using the optimized index, I reported previously that some of the
top-scoring results were missing. As it happens, the missing
results were typically the junk pages with high tf/idf but low
boost. Since we collect up to N hits, going from higher to
Andrzej,
well, I'm not done digging into the problem yet, but I want to ask some
more questions.
BTW, I counted 195 places that use NutchConf.get(), so this will be a
bigger patch. :)
As I mentioned, I would love to go the inversion-of-control way, so
not using NutchConf in the constructor
Hi,
right, this is a known problem that has been discussed several times; we should
start solving it. :-)
I suggest that we make the Plugin class implement the Configurable
interface. In case a plugin needs any configuration value, it will
request it from the plugin instance.
The next step would
mapred.job.tracker.info.port is defined 2 times in the nutch-default.xml
Key: NUTCH-146
URL: http://issues.apache.org/jira/browse/NUTCH-146
Project: Nutch
Type: Bug
Reporter: Stefan
to the ContentProperties mechanism.
I think using an array list may be easier than using properties that
are hosted in properties.
Stefan
On 21.12.2005 at 01:36, Paul Baclace wrote:
Stefan Groschupf wrote:
My suggestion is that we change NutchConf in the following way:
resourceNames.add
mapred is now trunk...
On 19.12.2005 at 18:46, Rafi Iz wrote:
Hi all,
I am currently working with Nutch 0.7.1,
I want to start using the mapred, any ideas where I can find the
latest version.
B.T.W I looked at the path: http://svn.apache.org/repos/asf/lucene/nutch/branches/
but the only
at 19:47, Andrzej Bialecki wrote:
Stefan Groschupf wrote:
Anyway, today we noticed that when fetching with http-client the sum
of errors and fetched pages is much less than the size defined
when generating the segment.
Changing to protocol-http solves the problem.
Has anyone also note
By the way, is there an easy way to split the index I
already have?
I would hate to recrawl all of the 1.9MM URLs again and waste
bandwidth.
Well, I do not know of any tool, shipped with Nutch or otherwise,
that does it; maybe there is one.
But to write a java class that creates two
Hi,
while writing the tests that make the generation bug reproducible,
I discovered another strange behavior.
The following test fails:
public void testConf() throws Exception {
  NutchConf conf = NutchConf.get();
  conf.setInt("mapred.reduce.tasks", 2);
[
http://issues.apache.org/jira/browse/NUTCH-3?page=comments#action_12360658 ]
Stefan Groschupf commented on NUTCH-3:
--
Thanks. :)
multi values of header discarded
Key: NUTCH-3
URL: http
[
http://issues.apache.org/jira/browse/NUTCH-3?page=comments#action_12360666 ]
Stefan Groschupf commented on NUTCH-3:
--
No problem, I can easily change this, but this will affect a lot of code. Just
give me some hours. I will do it against the svn since
[
http://issues.apache.org/jira/browse/NUTCH-3?page=comments#action_12360667 ]
Stefan Groschupf commented on NUTCH-3:
--
... the idea was to leave the API as it is and just add a new getProperties method.
Should we now in general replace setProperty
[ http://issues.apache.org/jira/browse/NUTCH-3?page=all ]
Stefan Groschupf reopened NUTCH-3:
--
improvement Doug suggested
multi values of header discarded
Key: NUTCH-3
URL: http
[ http://issues.apache.org/jira/browse/NUTCH-3?page=all ]
Stefan Groschupf updated NUTCH-3:
-
Attachment: contentPropertiesAddpatch.txt
Better?
multi values of header discarded
Key: NUTCH-3
URL
[
http://issues.apache.org/jira/browse/NUTCH-143?page=comments#action_12360571 ]
Stefan Groschupf commented on NUTCH-143:
Would be great in case you can provide a patch.
Improper error numbers returned on exit
Hi Ledio,
the current Nutch release is 0.7, or you can also use the 0.8 branch code.
Also, you are using old mailing lists; I suggest you use the Apache
Nutch user mailing list.
http://lucene.apache.org/nutch/mailing_lists.html
To answer your question, nutch does forward the query to all search
Hi,
I found this link on a news site; maybe some of you will find it interesting.
An Israeli mathematician, Hillel Tal-Ezer, of the Academic College
of Tel Aviv in Yaffo has written a paper on the faults of Google's
mathematical algorithms for page ranking
[ http://issues.apache.org/jira/browse/NUTCH-3?page=all ]
Stefan Groschupf updated NUTCH-3:
-
Attachment: multiValuesPropertyPatch.txt
Attached a patch that adds a getProperties method to the ContentProperties
class to retrieve a string array of values
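The idea of keeping every value of a repeated header can be sketched as follows. This is only an illustration and not the attached patch: the class `MultiValueProperties` and the methods `addProperty` and `getProperty` are hypothetical names, and only the `getProperties` name is taken from the message above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: MultiValueProperties is hypothetical and is not the NUTCH-3 patch. */
class MultiValueProperties {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Appends a value instead of overwriting an earlier one for the same key. */
    void addProperty(String key, String value) {
        values.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    /** Single-value view: returns the first value for the key, or null. */
    String getProperty(String key) {
        List<String> list = values.get(key);
        return list == null ? null : list.get(0);
    }

    /** Returns all values for the key in insertion order, or an empty array. */
    String[] getProperties(String key) {
        List<String> list = values.get(key);
        return list == null ? new String[0] : list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        MultiValueProperties props = new MultiValueProperties();
        props.addProperty("Set-Cookie", "a=1");
        props.addProperty("Set-Cookie", "b=2");
        System.out.println(String.join(", ", props.getProperties("Set-Cookie")));
    }
}
```

Keeping a single-value accessor next to the array accessor would let existing callers stay unchanged, which matches the stated wish to leave the API as it is.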
[
http://issues.apache.org/jira/browse/NUTCH-140?page=comments#action_12360409 ]
Stefan Groschupf commented on NUTCH-140:
From my point of view this makes things more complicated. Why not just use the
extension id? Where would be the advantage
Full list of open issues
complete description can be found here :
http://issues.apache.org/jira/secure/IssueNavigator.jspa?
view=fulltempMax=30
Please add a +1 in case you vote for the issue under this issue.
Please keep in mind that this will be more a maintenance release.
NUTCH-141
My personal fav. list
In a day or so I will count all votes and post them.
NUTCH-141 jobdetails.jsp doesnt work on webbrowser safari
+1
NUTCH-140 Add alias capability in parse-plugins.xml file that
allows mimeType-extensionId mapping
NUTCH-139 Standard metadata property names in
- job.setPartitionerClass(PartitionUrlByHost.class); in the generate
method
yes, this line is the one you need to change. The other stuff can be
as it is for now.
Do I only need to change the last line to using HashPartitioner.class,
or do I need to modify the other 2 references as well?
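The difference between the two partitioning strategies named in this exchange can be sketched in plain Java. This is a hedged illustration only: `PartitionDemo`, `hashPartition` and `hostPartition` are hypothetical names, and the code does not reproduce the actual `HashPartitioner` or `PartitionUrlByHost` classes; it only assumes that one spreads keys by the hash of the whole key and the other groups URLs by host.

```java
import java.net.URI;
import java.util.Locale;

/** Sketch only: hypothetical helpers, not the Nutch partitioner classes. */
class PartitionDemo {

    /** Hash partitioning: the partition depends on the hash of the whole URL. */
    static int hashPartition(String url, int numPartitions) {
        return (url.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Host partitioning: all URLs of one host land in the same partition. */
    static int hostPartition(String url, int numPartitions) {
        // URI.create throws IllegalArgumentException for malformed URLs.
        String host = URI.create(url).getHost();
        String key = host == null ? url : host.toLowerCase(Locale.ROOT);
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://example.com/a.html",
            "http://example.com/b.html",
            "http://example.org/index.html"
        };
        for (String url : urls) {
            System.out.println(url + " hash=" + hashPartition(url, 4)
                + " host=" + hostPartition(url, 4));
        }
    }
}
```

With host partitioning, a few very large hosts can make partitions uneven, while hash partitioning spreads URLs of the same host over several partitions.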
If there is no objection, I will commit these changes in the next
hours.
+ 1!!! :-)
+1!
BTW, did you notice that Jerome committed a patch that makes Content
meta data now case insensitive?
Stefan
On 13.12.2005 at 18:07, Chris Mattmann wrote:
Hi Folks,
I was just thinking about the ParseData java.util.Properties
metadata object
and thinking about the way that we store
This has been fixed in the mapred branch, but that patch is not in
0.7.1. This alone might be a reason to make a 0.7.2 release.
Maybe we can also get some more parser-selection-related issues fixed in the
next days and get them into a 0.7.2 release.
I would be happy to see some more parser
Hi geeks,
I do not have much deep knowledge about Unix file systems,
so my question is: what would be the best file system for Nutch
distributed file system data nodes?
Does it make any difference whether one uses one file system or another?
Would ReiserFS be a good choice?
Thanks for any
Reporter: Stefan Groschupf
Priority: Critical
We noticed that segments generated with the map reduce segment generator
contain only 50% of the expected URLs. We had a crawldb with 40 000 URLs and
the generate command only created a 20 000 page segment. This also happened
with the topN
[ http://issues.apache.org/jira/browse/NUTCH-135?page=all ]
Stefan Groschupf updated NUTCH-135:
---
Attachment: contentProperties_patch_WithContentProperties.txt
I forgot to add the ContentProperties class itself to version control... thanks
Jack!
http
Jack,
sorry there are now 3kb more in the patch :), please give it another
try.
Stefan
On 10.12.2005 at 15:30, Jack Tang wrote:
Stefan
It seems your patch is missing the
org.apache.nutch.protocol.ContentProperties class, right?
/Jack
On 12/10/05, Stefan Groschupf (JIRA) [EMAIL PROTECTED
Ken,
maybe the user mailing list would be a better place for such questions.
The size of your index depends on your configuration (what kind of
index filter plugins you use).
You can say a document in the index needs 10KB plus the metadata
like date, content type or category of the page.
Jack,
discussed here in detail:
http://issues.apache.org/jira/browse/NUTCH-133
I will provide a patch just fixing this issue very soon.
Stefan
On 09.12.2005 at 20:04, Jack Tang wrote:
Hi
I am going to standardize some fields which I stored in my parser
plugin. But I found that sometimes
: Nutch
Type: Bug
Components: fetcher
Versions: 0.7.1, 0.7
Reporter: Stefan Groschupf
Priority: Critical
Fix For: 0.8-dev, 0.7.2-dev
As described in issue NUTCH-133, some web servers return HTTP header metadata
whose case does not conform to the standard.
This provides many
[ http://issues.apache.org/jira/browse/NUTCH-135?page=all ]
Stefan Groschupf updated NUTCH-135:
---
Attachment: contentProperties_patch.txt
As Doug suggested, a patch using a TreeMap with String.CASE_INSENSITIVE_ORDER that
solves the problem of case insensitive
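The approach named here, a TreeMap ordered by String.CASE_INSENSITIVE_ORDER, can be sketched in a few lines. This is a minimal illustration of the standard Java behavior, not the attached patch; `HeaderMapDemo` and `newHeaderMap` are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: HeaderMapDemo is hypothetical, not the ContentProperties class. */
class HeaderMapDemo {

    /** Creates a map whose String keys compare without regard to case. */
    static Map<String, String> newHeaderMap() {
        return new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    }

    public static void main(String[] args) {
        Map<String, String> headers = newHeaderMap();
        headers.put("Content-Type", "text/html");
        // A lookup with different capitalization finds the same entry.
        System.out.println(headers.get("content-type"));
        // A put under another capitalization replaces the value but keeps the first key.
        headers.put("CONTENT-TYPE", "text/plain");
        System.out.println(headers);
    }
}
```

One consequence worth noting is that the map remembers the capitalization of the first key it saw, so code that writes headers back out will emit that spelling.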
[
http://issues.apache.org/jira/browse/NUTCH-135?page=comments#action_12360025 ]
Stefan Groschupf commented on NUTCH-135:
Andrzej, that is easy to add to the ContentProperties object, and sure, I can do
that. However, first I would love to get an OK
[ http://issues.apache.org/jira/browse/NUTCH-3?page=all ]
Stefan Groschupf reassigned NUTCH-3:
Assign To: Stefan Groschupf
multi values of header discarded
Key: NUTCH-3
URL: http
[
http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359725 ]
Stefan Groschupf commented on NUTCH-133:
Doug,
OK, I will split things into different patches and open a set of new bugs.
Jerome:
If you take a careful look at my
[ http://issues.apache.org/jira/browse/NUTCH-133?page=all ]
Stefan Groschupf closed NUTCH-133:
--
Resolution: Won't Fix
We will split the problems described here into a set of bugs to fix things step
by step.
ParserFactory does not work
[
http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359610 ]
Stefan Groschupf commented on NUTCH-133:
Jerome:
For 3 months or so, URL extensions and magic content type detection have
never been used. I suggest to assign
)?
Stefan
On 07.12.2005 at 20:29, Doug Cutting wrote:
This should work. TestRPC.java has a case which returns void
(ping). Can you send a simple test case that fails?
Doug
Stefan Groschupf wrote:
Hi,
I have never used RPC that intensively, so I was surprised to find
this limitation
[
http://issues.apache.org/jira/browse/NUTCH-133?page=comments#action_12359627 ]
Stefan Groschupf commented on NUTCH-133:
Doug, I already attached a unit test that calls ParseUtil.parse(Content) and
simulates the different scenarios.
I can extend
Hi,
put the patch into JIRA.
Currently, for the most important packages except map reduce, 0.7 and
0.8 are identical, and as far as I know Doug is synchronizing things
frequently.
Stefan
On 06.12.2005 at 17:44, James Nelson wrote:
Hello, hope this is the right place to ask this.
I'm
Hi,
I have never used RPC that intensively, so I was surprised to find this
limitation.
Is it known that the RPC.call method can only call methods that have
a return type?
RPC.java line 152
Object[] values =
(Object[])Array.newInstance(method.getReturnType(), wrappedValues.length);
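The limitation described here can be reproduced outside Nutch. The following is a minimal sketch, assuming the result array is allocated from the method's return type as in the quoted line; `ResultArrays`, `PingProtocol`, `find` and `newResultArray` are hypothetical names for illustration and are not part of Nutch. For a void method, `getReturnType()` returns `Void.TYPE`, and `Array.newInstance` rejects that type with an IllegalArgumentException, so the sketch guards for it. Primitive return types are not handled here.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Method;

/** Sketch only: ResultArrays and PingProtocol are hypothetical, not Nutch classes. */
class ResultArrays {

    /** Hypothetical protocol with one void and one non-void method. */
    interface PingProtocol {
        void ping();
        String echo(String value);
    }

    /** Looks up a PingProtocol method, wrapping the checked reflection exception. */
    static Method find(String name, Class<?>... parameterTypes) {
        try {
            return PingProtocol.class.getMethod(name, parameterTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Allocates the array that collects one return value per remote call. */
    static Object[] newResultArray(Method method, int calls) {
        Class<?> type = method.getReturnType();
        if (type == Void.TYPE) {
            // Array.newInstance(Void.TYPE, n) would throw IllegalArgumentException.
            return new Object[0];
        }
        // Assumes an object return type; a primitive type would need boxing.
        return (Object[]) Array.newInstance(type, calls);
    }

    public static void main(String[] args) {
        System.out.println(newResultArray(find("ping"), 3).length);
        System.out.println(
            newResultArray(find("echo", String.class), 3).getClass().getSimpleName());
    }
}
```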
ParserFactory does not work as expected
---
Key: NUTCH-133
URL: http://issues.apache.org/jira/browse/NUTCH-133
Project: Nutch
Type: Bug
Versions: 0.8-dev, 0.7.1, 0.7.2-dev
Reporter: Stefan Groschupf
Priority
[ http://issues.apache.org/jira/browse/NUTCH-133?page=all ]
Stefan Groschupf updated NUTCH-133:
---
Attachment: Parserutil_test_patch.txt
A test that reproduces most of the problems; see a real-world sample URL in the
conclusion above.
ParserFactory does
[ http://issues.apache.org/jira/browse/NUTCH-133?page=all ]
Stefan Groschupf updated NUTCH-133:
---
Attachment: ParserFactoryPatch_nutch.0.7_patch.txt
A patch that solves the described problems for nutch 0.7.
MimeTypes detection is now REALLY used
On 02.12.2005 at 10:15, Andrzej Bialecki wrote:
Yes, this is required to detect unmodified content. A small note:
plain MD5Hash(byte[] content) is quite ineffective for many pages,
e.g. pages with a counter, or with ads. It would be good to provide
a framework for other implementations
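A pluggable signature framework of the kind suggested here could look roughly like the following. This is a sketch under stated assumptions, not an existing Nutch API: `ContentSignature`, `Md5Signature` and `DigitInsensitiveSignature` are hypothetical names, and replacing digit runs is just one simple stand-in for "ignore the counter".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Sketch only: every name here is hypothetical, not an existing Nutch API. */
class SignatureDemo {

    /** Pluggable strategy that reduces page content to a comparable fingerprint. */
    interface ContentSignature {
        byte[] calculate(byte[] content);
    }

    /** Plain MD5 over the raw bytes: any changed byte changes the signature. */
    static final class Md5Signature implements ContentSignature {
        @Override
        public byte[] calculate(byte[] content) {
            try {
                return MessageDigest.getInstance("MD5").digest(content);
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException("MD5 not available", e);
            }
        }
    }

    /** Replaces digit runs with a placeholder first, so a changing counter is ignored. */
    static final class DigitInsensitiveSignature implements ContentSignature {
        private final ContentSignature delegate = new Md5Signature();

        @Override
        public byte[] calculate(byte[] content) {
            String text = new String(content, StandardCharsets.UTF_8);
            String normalized = text.replaceAll("[0-9]+", "#");
            return delegate.calculate(normalized.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** True if both versions of a page have the same signature. */
    static boolean unchanged(ContentSignature signature, String before, String after) {
        return Arrays.equals(
            signature.calculate(before.getBytes(StandardCharsets.UTF_8)),
            signature.calculate(after.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        String monday = "<p>Welcome! Visitors so far: 1041</p>";
        String tuesday = "<p>Welcome! Visitors so far: 1042</p>";
        System.out.println(unchanged(new Md5Signature(), monday, tuesday));
        System.out.println(unchanged(new DigitInsensitiveSignature(), monday, tuesday));
    }
}
```

Keeping the strategy behind an interface means the fetcher would only compare byte arrays and never needs to know how a particular signature treats counters or ads.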
Check out the latest source from svn; use the branch called mapred.
This URL gives you a kick start for installing a map reduce system on
several boxes:
http://wiki.media-style.com/display/nutchDocu/setup+a+map+reduce+multi+box+system
The 0.8 branch works very well for me, but for sure there are some
On 25.11.2005 at 11:30, Erik Hatcher wrote:
On 24 Nov 2005, at 23:49, Chris Mattmann wrote:
Dublin Core may be good for the semantic web, but not for content
storage.
I completely disagree with that.
Me too.
Are we talking about parsing RDF, or are we discussing storing parsed HTML
text in RDF
Sounds like a problem with the hostnames of your datanodes.
Check that you are able to ping all the datanodes using the hostnames
they sent to the namenode.
Check:
bin/nutch ndfs -report to see the hostnames.
Stefan
On 24.11.2005 at 16:04, Anton Potehin wrote:
When we start namenode
definition overwrites the first.
So sure, allow multiple values for one key across multiple files, but we should warn
in case a key is defined twice in the same file.
Could I clarify my suggestion?
Stefan
On 24.11.2005 at 18:30, Andrzej Bialecki wrote:
Stefan Groschupf (JIRA) wrote:
second
Jérôme,
A mail archive is an amazing source of information, isn't it?! :-)
To answer your question, just ask yourself how many pages per second
you plan to fetch and parse, and how many queries per second a Lucene
index is able to handle - and you can deliver in the UI.
I have here
Correct me if I'm wrong, but isn't log4j used a lot within Nutch? :-)
No, Nutch uses Java logging; only some plugins use jars that depend
on log4j.
Stefan
Andrzej,
very interesting!!!
Nutch Summarizer also needlessly re-tokenizes the text over and
over again - perhaps it would be better to save already tokenized
text in parse_text, instead of the raw plain text? After all, the
only use for that text is to index it and then build the
Reporter: Stefan Groschupf
Priority: Blocker
The ndfs client returns incorrect values when using du, or ls does not return items.
It looks like there is a problem with the virtual file structure, since -du
only reads the metadata, doesn't it?
We had moved some data from folder to folder and after
[
http://issues.apache.org/jira/browse/NUTCH-99?page=comments#action_12357616 ]
Stefan Groschupf commented on NUTCH-99:
---
SURE! That is absolutely OK for me!
Thanks a lot Piotr
ports are hardcoded or random
Hi Doug,
a very small improvement suggestion.
Currently the map method in the Mapper interface can throw an IOException.
I would find it better if it just threw a general Exception,
since a map task can fail for other reasons as well, e.g. in the
map search server scenario you
Hi Johannes,
right, but consider the case where you have 200 boxes and each box needs to open 4
different connections to the master.
Then the master has 200 * 4 connections = 800 threads = the limit of
the 2.4 kernel.
In case you open only one connection per box you are also limited to
running 800 boxes per
On 11.11.2005 at 11:48, Apache Wiki wrote:
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki
for change notification.
The following page has been changed by PaulBaclace:
http://wiki.apache.org/nutch/OverviewDeploymentConfigs
New page:
== Overview of
Oops, sorry...
Paul, maybe you should mention that these scripts require ssh in a
version higher than 3.8.
A great page!
Stefan
On 11.11.2005 at 13:45, Stefan Groschupf wrote:
On 11.11.2005 at 11:48, Apache Wiki wrote:
Dear Wiki user,
You have subscribed to a wiki page or wiki category
Hi Doug,
In the future I would like to implement a more automated
distributed search system than Nutch currently has. One way to do
this might be to use MapReduce. Each map task's input could be an
index and some segment data. The map method would serve queries,
i.e., run a Nutch
[
http://issues.apache.org/jira/browse/NUTCH-99?page=comments#action_12357409 ]
Stefan Groschupf commented on NUTCH-99:
---
I'm not sure what you mean by "catching Exception is overkill".
In case the attempt to open a server on this port fails
Hi,
see
http://wiki.apache.org/nutch/GettingNutchRunningWithWindows
HTH
Stefan
On 10.11.2005 at 06:44, KAAS INFOTECH wrote:
Hi All,
I am new to nutch. I have downloaded the latest nutch-0.7.1. I have
Microsoft Windows installed on my PC with Java home set. I came to know that
cygwin is
require
Hi,
Do you have any query filter installed?
Stefan
On 10.11.2005 at 09:37, Game Now wrote:
Hi all,
I pass some strings to the org.apache.nutch.searcher.Query#parse() method,
but I get different results, as shown below:
parameter string: area:XX, returnedQuery.toString() is: area:XX.
parameter
+1
On 10.11.2005 at 19:03, Rod Taylor wrote:
Generator.java.patch
Hi Jake,
take a look here:
http://wiki.media-style.com/display/nutchDocu/Why+nutch+has+a+plugin+system
This short text already explains why Nutch has a plugin system :)
Stefan
On 10.11.2005 at 20:04, Apache Wiki wrote:
Dear Wiki user,
You have subscribed to a wiki page or wiki category on
and three copies of chunks are distributed on the slaves. If slave 1
is 90% busy, and 2 is 80% busy, 3 is idle. How does NFS do in this
case?
Currently you have to do that manually, but there will be an
automatic solution later.
Or could you tell me where should I start learning?
The
Pre score calculation is done in the indexer.
Yes it works with complete webcrawls as well, and it works very well
for that. :-)
Stefan
On 08.11.2005 at 11:22, Anton Potehin wrote:
What about scoring in mapred? I have looked at crawl/crawl.java but I did
not find anything concerned with
Nutch uses the concept of segments, and yes, you are able to update
part of the index by just deleting older segments and generating /
fetching new segments.
Stefan
On 08.11.2005 at 18:38, Jack Tang wrote:
Hi
I read GFS document and NFS document on the wiki. One interesting
question here:
That is the point of the plugin system: each plugin can have its own
libraries and can choose whether or not to share them with other plugins.
Stefan
On 07.11.2005 at 16:08, Byron Miller wrote:
Is there any way to make sure all plugins/modules
reference a standard version of log4j? seems to me
there are
I tried running one datanode per machine connecting back to the
same SAN
but it seemed pretty clunky.
SAN in general is a bad idea. A SAN is too slow for a serious setup.
... and it is the single point of failure...
Better to use many local hard disks.
Stefan
[
http://issues.apache.org/jira/browse/NUTCH-99?page=comments#action_12356853 ]
Stefan Groschupf commented on NUTCH-99:
---
Is there anything I can improve so that one of the developers commits this patch to
svn?
Thanks in case one of the people