[ http://issues.apache.org/jira/browse/NUTCH-251?page=all ]
Sami Siren updated NUTCH-251:
-
Fix Version/s: 0.9-dev
(was: 0.8-dev)
Administration GUI
--
Key: NUTCH-251
URL:
[ http://issues.apache.org/jira/browse/NUTCH-318?page=all ]
Sami Siren updated NUTCH-318:
-
Fix Version/s: 0.9-dev
(was: 0.8-dev)
log4j not proper configured, readdb doesnt give any information
[ http://issues.apache.org/jira/browse/NUTCH-322?page=all ]
Sami Siren updated NUTCH-322:
-
Fix Version/s: 0.9-dev
(was: 0.8-dev)
Fetcher discards ProtocolStatus, doesn't store redirected pages
[ http://issues.apache.org/jira/browse/NUTCH-262?page=all ]
Sami Siren updated NUTCH-262:
-
Fix Version/s: 0.9-dev
(was: 0.8-dev)
Summary excerpts and highlights problems
[ http://issues.apache.org/jira/browse/NUTCH-233?page=all ]
Sami Siren updated NUTCH-233:
-
Fix Version/s: 0.9-dev
(was: 0.8-dev)
wrong regular expression hang reduce process for ever
[ http://issues.apache.org/jira/browse/NUTCH-247?page=all ]
Sami Siren updated NUTCH-247:
-
Fix Version/s: 0.9-dev
(was: 0.8-dev)
robot parser to restrict.
-
Key: NUTCH-247
[ http://issues.apache.org/jira/browse/NUTCH-325?page=all ]
Sami Siren updated NUTCH-325:
-
Fix Version/s: 0.9-dev
(was: 0.8-dev)
UrlFilters.java throws NPE in case urlfilter.order contains Filters that are
not in plugin.includes
[
http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12422929 ]
Sami Siren commented on NUTCH-266:
--
I finally found the time to set up an environment with cygwin and try this out.
I can confirm that the hadoop.jar version
bin/nutch setting of log path problems on cygwin
Key: NUTCH-327
URL: http://issues.apache.org/jira/browse/NUTCH-327
Project: Nutch
Issue Type: Bug
Affects Versions: 0.8-dev
[ http://issues.apache.org/jira/browse/NUTCH-327?page=all ]
Sami Siren resolved NUTCH-327.
--
Resolution: Fixed
bin/nutch setting of log path problems on cygwin
Key: NUTCH-327
commons-cli-2.0-SNAPSHOT.jar provided with nutch is not compatible with jdk 1.4
---
Key: NUTCH-328
URL: http://issues.apache.org/jira/browse/NUTCH-328
Project: Nutch
[ http://issues.apache.org/jira/browse/NUTCH-328?page=all ]
Sami Siren resolved NUTCH-328.
--
Resolution: Fixed
updated library
commons-cli-2.0-SNAPSHOT.jar provided with nutch is not compatible with jdk
1.4
[
http://issues.apache.org/jira/browse/NUTCH-293?page=comments#action_12421930 ]
Sami Siren commented on NUTCH-293:
--
perhaps instead of
delay = crawlDelay > 0 ? crawlDelay : serverDelay;
we could do
delay=Math.max(crawlDelay, serverDelay);
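The two expressions differ only when a robots.txt Crawl-delay is set but is shorter than the configured server delay. A minimal sketch of that difference follows, assuming the first expression tests `crawlDelay > 0` and that both values are in milliseconds. The class and method names are hypothetical and are not taken from the Nutch source.

```java
public class DelayChoice {

    // First expression from the comment: use Crawl-delay whenever it is set,
    // even if it is shorter than the configured server delay.
    static long preferCrawlDelay(long crawlDelay, long serverDelay) {
        return crawlDelay > 0 ? crawlDelay : serverDelay;
    }

    // Suggested alternative: never wait less than the configured server delay.
    static long maxDelay(long crawlDelay, long serverDelay) {
        return Math.max(crawlDelay, serverDelay);
    }

    public static void main(String[] args) {
        long serverDelay = 5000L; // assumed default politeness delay
        long[] crawlDelays = { -1L, 1000L, 10000L }; // unset, shorter, longer

        for (long crawlDelay : crawlDelays) {
            System.out.println("crawlDelay=" + crawlDelay
                + " ternary=" + preferCrawlDelay(crawlDelay, serverDelay)
                + " max=" + maxDelay(crawlDelay, serverDelay));
        }
    }
}
```

With a Crawl-delay of 1000 and a server delay of 5000, the ternary form yields 1000 while `Math.max` yields 5000. For an unset or longer Crawl-delay the two forms agree.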
DmozParser does not output urls to stdout
-
Key: NUTCH-320
URL: http://issues.apache.org/jira/browse/NUTCH-320
Project: Nutch
Issue Type: Bug
Affects Versions: 0.8-dev
Reporter: Sami
[ http://issues.apache.org/jira/browse/NUTCH-320?page=all ]
Sami Siren resolved NUTCH-320.
--
Resolution: Fixed
DmozParser does not output urls to stdout
-
Key: NUTCH-320
URL:
[ http://issues.apache.org/jira/browse/NUTCH-172?page=all ]
Sami Siren resolved NUTCH-172:
--
Fix Version: 0.8-dev
Resolution: Fixed
Assign To: Andrzej Bialecki
this has already been implemented by ab
mergesegs
Segment merger
[ http://issues.apache.org/jira/browse/NUTCH-306?page=all ]
Sami Siren resolved NUTCH-306:
--
Fix Version: 0.8-dev
Resolution: Fixed
just committed this, thanks Grant!
DistributedSearch.Client liveAddresses concurrency problem
[ http://issues.apache.org/jira/browse/NUTCH-110?page=all ]
Sami Siren reassigned NUTCH-110:
Assign To: Sami Siren
OpenSearchServlet outputs illegal xml characters
Key: NUTCH-110
[
http://issues.apache.org/jira/browse/NUTCH-110?page=comments#action_12416932 ]
Sami Siren commented on NUTCH-110:
--
in method addAttribute(...)
line:
attribute.setValue(getLegalXml(getLegalXml(value)));
intentional?
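If the method only drops characters outside the XML 1.0 `Char` range, applying it twice returns the same string as applying it once, so the nested call would be redundant. A small sketch of that property follows. `stripIllegalXml` is a hypothetical stand-in written for illustration; it is an assumption about what such a filter does, not the actual `getLegalXml` implementation from the patch.

```java
public class LegalXmlSketch {

    // XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF]
    // | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Hypothetical filter: copies only the code points that XML 1.0 allows.
    static String stripIllegalXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isLegalXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "a\0b\bc"; // contains NUL and backspace, both illegal in XML 1.0
        String once = stripIllegalXml(raw);
        String twice = stripIllegalXml(once);
        // The second pass finds nothing left to remove.
        System.out.println(once.equals(twice));
    }
}
```

A filter that escaped characters instead of dropping them would not necessarily have this property, in which case a double call would change the output.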
OpenSearchServlet outputs illegal
[ http://issues.apache.org/jira/browse/NUTCH-302?page=all ]
Sami Siren resolved NUTCH-302:
--
Resolution: Fixed
Assign To: Sami Siren
java doc of CrawlDb is wrong
Key: NUTCH-302
URL:
[ http://issues.apache.org/jira/browse/NUTCH-166?page=all ]
Sami Siren resolved NUTCH-166:
--
Resolution: Won't Fix
this is hadoop related
secure jobtracker info pages with a password
Key:
[ http://issues.apache.org/jira/browse/NUTCH-110?page=all ]
Sami Siren resolved NUTCH-110:
--
Fix Version: 0.8-dev
Resolution: Fixed
I just committed this with small changes (moved test to a test case), thanks.
OpenSearchServlet outputs illegal
[ http://issues.apache.org/jira/browse/NUTCH-292?page=all ]
Sami Siren resolved NUTCH-292:
--
Fix Version: 0.8-dev
Resolution: Fixed
Assign To: Sami Siren
I just committed this, thank you!
OpenSearchServlet: OutOfMemoryError: Java heap
[ http://issues.apache.org/jira/browse/NUTCH-156?page=all ]
Sami Siren resolved NUTCH-156:
--
Resolution: Won't Fix
I guess the logging is now handled differently, so old logs are not overwritten
anymore
nutch-daemon.sh should not overwrite old logs
[
http://issues.apache.org/jira/browse/NUTCH-180?page=comments#action_12416979 ]
Sami Siren commented on NUTCH-180:
--
There's a naive caching implementation under contrib/web2/plugins which one
might try out and improve
Performance problem with widely
[
http://issues.apache.org/jira/browse/NUTCH-306?page=comments#action_12416673 ]
Sami Siren commented on NUTCH-306:
--
This patch does not seem to apply anymore, can you please attach a patch
against current svn trunk.
DistributedSearch.Client
[ http://issues.apache.org/jira/browse/NUTCH-306?page=all ]
Sami Siren reassigned NUTCH-306:
Assign To: Sami Siren
DistributedSearch.Client liveAddresses concurrency problem
--
Key:
[ http://issues.apache.org/jira/browse/NUTCH-122?page=all ]
Sami Siren resolved NUTCH-122:
--
Resolution: Invalid
this is more related to hadoop
block numbers need a better random number generator
---
[ http://issues.apache.org/jira/browse/NUTCH-187?page=all ]
Sami Siren closed NUTCH-187:
Resolution: Won't Fix
closed as requested
Cannot start Nutch datanodes on Windows outside of a cygwin environment
because of DF
[
http://issues.apache.org/jira/browse/NUTCH-48?page=comments#action_12415016 ]
Sami Siren commented on NUTCH-48:
-
Stefan, I tried to apply your combined patch but it seems that the test case
does not compile.
Did you mean query enhancement/refignment
[ http://issues.apache.org/jira/browse/NUTCH-201?page=all ]
Sami Siren resolved NUTCH-201:
--
Resolution: Fixed
just committed this
add support for subcollections
--
Key: NUTCH-201
URL:
[ http://issues.apache.org/jira/browse/NUTCH-280?page=all ]
Sami Siren resolved NUTCH-280:
--
Fix Version: 0.8-dev
Resolution: Fixed
Assign To: Sami Siren
fixed in trunk, thanks for reporting this
url query causes NullPointerException
[ http://issues.apache.org/jira/browse/NUTCH-221?page=all ]
Sami Siren resolved NUTCH-221:
--
Resolution: Fixed
committed
prepare nutch for upcoming lucene 2.0
-
Key: NUTCH-221
URL:
[ http://issues.apache.org/jira/browse/NUTCH-221?page=all ]
Sami Siren updated NUTCH-221:
-
Attachment: nutch-lucene-deprecation.txt
prepare nutch for upcoming lucene 2.0
-
Key: NUTCH-221
URL:
[ http://issues.apache.org/jira/browse/NUTCH-137?page=all ]
Sami Siren resolved NUTCH-137:
--
Fix Version: 0.8-dev
Resolution: Fixed
fixed as related to NUTCH-81
footer is not displayed in search result page
[ http://issues.apache.org/jira/browse/NUTCH-123?page=all ]
Sami Siren closed NUTCH-123:
Fix Version: 0.8-dev
Resolution: Duplicate
problem reported to be fixed in NUTCH-135
Cache.jsp some times generate NullPointerException
[ http://issues.apache.org/jira/browse/NUTCH-64?page=all ]
Sami Siren resolved NUTCH-64:
-
Resolution: Duplicate
duplicate of NUTCH-14
no results after a restart of a search--server (without tomcat restart)
[ http://issues.apache.org/jira/browse/NUTCH-90?page=all ]
Sami Siren resolved NUTCH-90:
-
Resolution: Invalid
doesn't seem to apply anymore
reduce logging output of IndexSegment
-
Key: NUTCH-90
[ http://issues.apache.org/jira/browse/NUTCH-200?page=all ]
Sami Siren resolved NUTCH-200:
--
Fix Version: 0.8-dev
Resolution: Fixed
this is now fixed, thanks
OpenSearch Servlet ist broken
-
Key: NUTCH-200
[ http://issues.apache.org/jira/browse/NUTCH-81?page=all ]
Sami Siren reassigned NUTCH-81:
---
Assign To: Sami Siren
Webapp only works when deployed in root
---
Key: NUTCH-81
URL:
[ http://issues.apache.org/jira/browse/NUTCH-178?page=all ]
Sami Siren reassigned NUTCH-178:
Assign To: Sami Siren
in search.jsp must be session creation false
--
Key: NUTCH-178
URL:
add support for subcollections
--
Key: NUTCH-201
URL: http://issues.apache.org/jira/browse/NUTCH-201
Project: Nutch
Type: New Feature
Versions: 0.8-dev
Reporter: Sami Siren
Assigned to: Sami Siren
Priority: Minor
[ http://issues.apache.org/jira/browse/NUTCH-201?page=all ]
Sami Siren updated NUTCH-201:
-
Attachment: subcollections-1.patch
add support for subcollections
--
Key: NUTCH-201
URL:
[
http://issues.apache.org/jira/browse/NUTCH-193?page=comments#action_12364663 ]
Sami Siren commented on NUTCH-193:
--
+1
I guess the fuse-j - ndfs work from John/me could be part of hadoop /contrib
after this change?
move NDFS and MapReduce to a
[
http://issues.apache.org/jira/browse/NUTCH-44?page=comments#action_12364679 ]
Sami Siren commented on NUTCH-44:
-
Byron, have you made any progress with this?
too many search results
---
Key: NUTCH-44
URL:
[ http://issues.apache.org/jira/browse/NUTCH-146?page=all ]
Sami Siren resolved NUTCH-146:
--
Fix Version: 0.8-dev
Resolution: Fixed
Assign To: Sami Siren
mapred.job.tracker.info.port is defined 2 times in the nutch-default.xml
[ http://issues.apache.org/jira/browse/NUTCH-145?page=all ]
Sami Siren resolved NUTCH-145:
--
Fix Version: 0.8-dev
Resolution: Fixed
Assign To: Sami Siren
this is now committed, thanks
build of war file fails on Chinese (zh) .xml files due