[jira] Assigned: (NUTCH-178) in search.jsp must be session creation "false"

2006-02-03 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-178?page=all ] Sami Siren reassigned NUTCH-178: Assign To: Sami Siren > in search.jsp must be session creation "false" > -- > > Key: NUTCH-178 > UR

[jira] Resolved: (NUTCH-178) in search.jsp must be session creation "false"

2006-02-03 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-178?page=all ] Sami Siren resolved NUTCH-178: -- Fix Version: 0.8-dev Resolution: Fixed disabled sessions, thanks for pointing out > in search.jsp must be session creation "false" > -

[jira] Created: (NUTCH-201) add support for subcollections

2006-02-03 Thread Sami Siren (JIRA)
add support for subcollections -- Key: NUTCH-201 URL: http://issues.apache.org/jira/browse/NUTCH-201 Project: Nutch Type: New Feature Versions: 0.8-dev Reporter: Sami Siren Assigned to: Sami Siren Priority: Minor Fix

[jira] Updated: (NUTCH-201) add support for subcollections

2006-02-03 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-201?page=all ] Sami Siren updated NUTCH-201: - Attachment: subcollections-1.patch > add support for subcollections > -- > > Key: NUTCH-201 > URL: http://issues.apache.o

[jira] Updated: (NUTCH-201) add support for subcollections

2006-02-04 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-201?page=all ] Sami Siren updated NUTCH-201: - Attachment: subcollections.2.patch -added missing class -nutch->hadoop api changes > add support for subcollections > -- > > Key:

[jira] Assigned: (NUTCH-200) OpenSearch Servlet ist broken

2006-02-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-200?page=all ] Sami Siren reassigned NUTCH-200: Assign To: Sami Siren > OpenSearch Servlet ist broken > - > > Key: NUTCH-200 > URL: http://issues.apache.org/jira/b

[jira] Resolved: (NUTCH-200) OpenSearch Servlet ist broken

2006-02-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-200?page=all ] Sami Siren resolved NUTCH-200: -- Fix Version: 0.8-dev Resolution: Fixed this is now fixed, thanks > OpenSearch Servlet ist broken > - > > Key: NUTCH-2

[jira] Assigned: (NUTCH-81) Webapp only works when deployed in root

2006-02-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-81?page=all ] Sami Siren reassigned NUTCH-81: --- Assign To: Sami Siren > Webapp only works when deployed in root > --- > > Key: NUTCH-81 > URL: http://issues.a

[jira] Resolved: (NUTCH-81) Webapp only works when deployed in root

2006-02-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-81?page=all ] Sami Siren resolved NUTCH-81: - Fix Version: 0.8-dev Resolution: Fixed this is now applied, thanks > Webapp only works when deployed in root > --- > >

[jira] Resolved: (NUTCH-137) footer is not displayed in search result page

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-137?page=all ] Sami Siren resolved NUTCH-137: -- Fix Version: 0.8-dev Resolution: Fixed fixed as related to NUTCH-81 > footer is not displayed in search result page > ---

[jira] Resolved: (NUTCH-118) FAQ link points to invalid URL

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-118?page=all ] Sami Siren resolved NUTCH-118: -- Fix Version: 0.8-dev Resolution: Fixed fixed as related to NUTCH-81 > FAQ link points to invalid URL > -- > > Key: N

[jira] Assigned: (NUTCH-184) Serbian (sr, Cyrilic) and Serbo-Croatian (sh, Latin) translation

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-184?page=all ] Sami Siren reassigned NUTCH-184: Assign To: Sami Siren > Serbian (sr, Cyrilic) and Serbo-Croatian (sh, Latin) translation > > >

[jira] Commented: (NUTCH-165) object pooling for nutch bean --- to impriove performance

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-165?page=comments#action_12366349 ] Sami Siren commented on NUTCH-165: -- NutchBean is already cached in the application context by the servlet container; isn't this sufficient? > object pooling for nutch bean --- to impri

[jira] Closed: (NUTCH-123) Cache.jsp some times generate NullPointerException

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-123?page=all ] Sami Siren closed NUTCH-123: Fix Version: 0.8-dev Resolution: Duplicate problem reported to be fixed in NUTCH-135 > Cache.jsp some times generate NullPointerException > -

[jira] Resolved: (NUTCH-184) Serbian (sr, Cyrilic) and Serbo-Croatian (sh, Latin) translation

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-184?page=all ] Sami Siren resolved NUTCH-184: -- Fix Version: 0.8-dev Resolution: Fixed This is now committed, thank you! > Serbian (sr, Cyrilic) and Serbo-Croatian (sh, Latin) translation >

[jira] Assigned: (NUTCH-48) "Did you mean" query enhancement/refignment feature request

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-48?page=all ] Sami Siren reassigned NUTCH-48: --- Assign To: Sami Siren > "Did you mean" query enhancement/refignment feature request > > > Key

[jira] Resolved: (NUTCH-64) no results after a restart of a search--server (without tomcat restart)

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-64?page=all ] Sami Siren resolved NUTCH-64: - Resolution: Duplicate duplicate with NUTCH-14 > no results after a restart of a search--server (without tomcat restart) >

[jira] Resolved: (NUTCH-90) reduce logging output of IndexSegment

2006-02-14 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-90?page=all ] Sami Siren resolved NUTCH-90: - Resolution: Invalid doesn't seem to apply anymore > reduce logging output of IndexSegment > - > > Key: NUTCH-90 >

[jira] Created: (NUTCH-221) prepare nutch for upcoming lucene 2.0

2006-03-03 Thread Sami Siren (JIRA)
prepare nutch for upcoming lucene 2.0 - Key: NUTCH-221 URL: http://issues.apache.org/jira/browse/NUTCH-221 Project: Nutch Type: Task Environment: all Reporter: Sami Siren Assigned to: Sami Siren Priority: Minor Fix

[jira] Updated: (NUTCH-221) prepare nutch for upcoming lucene 2.0

2006-03-03 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-221?page=all ] Sami Siren updated NUTCH-221: - Attachment: nutch-lucene-deprecation.txt > prepare nutch for upcoming lucene 2.0 > - > > Key: NUTCH-221 > URL: ht

[jira] Resolved: (NUTCH-221) prepare nutch for upcoming lucene 2.0

2006-03-05 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-221?page=all ] Sami Siren resolved NUTCH-221: -- Resolution: Fixed committed > prepare nutch for upcoming lucene 2.0 > - > > Key: NUTCH-221 > URL: http://

[jira] Created: (NUTCH-248) add support for internationalized domain names

2006-04-15 Thread Sami Siren (JIRA)
add support for internationalized domain names -- Key: NUTCH-248 URL: http://issues.apache.org/jira/browse/NUTCH-248 Project: Nutch Type: Improvement Components: web gui Reporter: Sami Siren Priority: Minor I

[jira] Resolved: (NUTCH-280) url query causes NullPointerException

2006-05-23 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-280?page=all ] Sami Siren resolved NUTCH-280: -- Fix Version: 0.8-dev Resolution: Fixed Assign To: Sami Siren fixed in trunk, thanks for reporting this > url query causes NullPointerException > ---

[jira] Resolved: (NUTCH-201) add support for subcollections

2006-06-05 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-201?page=all ] Sami Siren resolved NUTCH-201: -- Resolution: Fixed just committed this > add support for subcollections > -- > > Key: NUTCH-201 > URL: http://issu

[jira] Commented: (NUTCH-48) "Did you mean" query enhancement/refignment feature request

2006-06-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-48?page=comments#action_12415016 ] Sami Siren commented on NUTCH-48: - stefan, I tried to apply your combined patch but it seems that the test case does not compile. > "Did you mean" query enhancement/refignment

[jira] Assigned: (NUTCH-306) DistributedSearch.Client liveAddresses concurrency problem

2006-06-15 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-306?page=all ] Sami Siren reassigned NUTCH-306: Assign To: Sami Siren > DistributedSearch.Client liveAddresses concurrency problem > -- > > Key:

[jira] Resolved: (NUTCH-122) block numbers need a better random number generator

2006-06-15 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-122?page=all ] Sami Siren resolved NUTCH-122: -- Resolution: Invalid this is more related to hadoop > block numbers need a better random number generator > ---

[jira] Closed: (NUTCH-187) Cannot start Nutch datanodes on Windows outside of a cygwin environment because of DF

2006-06-15 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-187?page=all ] Sami Siren closed NUTCH-187: Resolution: Won't Fix closed as requested > Cannot start Nutch datanodes on Windows outside of a cygwin environment > because of DF > ---

[jira] Commented: (NUTCH-306) DistributedSearch.Client liveAddresses concurrency problem

2006-06-18 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-306?page=comments#action_12416673 ] Sami Siren commented on NUTCH-306: -- This patch does not seem to apply anymore, can you please attach a patch against current svn trunk. > DistributedSearch.Client liveAddres

[jira] Assigned: (NUTCH-110) OpenSearchServlet outputs illegal xml characters

2006-06-20 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-110?page=all ] Sami Siren reassigned NUTCH-110: Assign To: Sami Siren > OpenSearchServlet outputs illegal xml characters > > > Key: NUTCH-110 >

[jira] Commented: (NUTCH-110) OpenSearchServlet outputs illegal xml characters

2006-06-20 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-110?page=comments#action_12416932 ] Sami Siren commented on NUTCH-110: -- in method addAttribute(...) line: attribute.setValue(getLegalXml(getLegalXml(value))); intentional? > OpenSearchServlet outputs illegal

[jira] Resolved: (NUTCH-302) java doc of CrawlDb is wrong

2006-06-20 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-302?page=all ] Sami Siren resolved NUTCH-302: -- Resolution: Fixed Assign To: Sami Siren > java doc of CrawlDb is wrong > > > Key: NUTCH-302 > URL: http://is

[jira] Resolved: (NUTCH-166) secure jobtracker info pages with a password

2006-06-20 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-166?page=all ] Sami Siren resolved NUTCH-166: -- Resolution: Won't Fix this is hadoop related > secure jobtracker info pages with a password > > > Key: NU

[jira] Resolved: (NUTCH-110) OpenSearchServlet outputs illegal xml characters

2006-06-20 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-110?page=all ] Sami Siren resolved NUTCH-110: -- Fix Version: 0.8-dev Resolution: Fixed I just committed this with small changes (moved test to a test case) thanks. > OpenSearchServlet outputs illegal xm

[jira] Resolved: (NUTCH-292) OpenSearchServlet: OutOfMemoryError: Java heap space

2006-06-20 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-292?page=all ] Sami Siren resolved NUTCH-292: -- Fix Version: 0.8-dev Resolution: Fixed Assign To: Sami Siren I just committed this, thank you! > OpenSearchServlet: OutOfMemoryError: Java heap spac

[jira] Resolved: (NUTCH-156) nutch-daemon.sh should not overwrite old logs by default

2006-06-20 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-156?page=all ] Sami Siren resolved NUTCH-156: -- Resolution: Won't Fix I guess the logging is now handled differently, so old logs are not overwritten anymore > nutch-daemon.sh should not overwrite old logs

[jira] Commented: (NUTCH-180) Performance problem with widely used keywords

2006-06-20 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-180?page=comments#action_12416979 ] Sami Siren commented on NUTCH-180: -- There's a naive caching implementation under contrib/web2/plugins which one might try out and improve > Performance problem with widely use

[jira] Resolved: (NUTCH-306) DistributedSearch.Client liveAddresses concurrency problem

2006-06-27 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-306?page=all ] Sami Siren resolved NUTCH-306: -- Fix Version: 0.8-dev Resolution: Fixed just committed this, thanks Grant! > DistributedSearch.Client liveAddresses concurrency problem > -

[jira] Created: (NUTCH-315) CrawlDbReader usage text - implementation mismatch

2006-06-28 Thread Sami Siren (JIRA)
CrawlDbReader usage text - implementation mismatch -- Key: NUTCH-315 URL: http://issues.apache.org/jira/browse/NUTCH-315 Project: Nutch Type: Bug Versions: 0.8-dev Reporter: Sami Siren Priority: Trivial

[jira] Resolved: (NUTCH-172) Segment merger

2006-07-11 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-172?page=all ] Sami Siren resolved NUTCH-172: -- Fix Version: 0.8-dev Resolution: Fixed Assign To: Andrzej Bialecki this has already been implemented by ab mergesegs > Segment merger > -

[jira] Created: (NUTCH-320) DmozParser does not output urls to stdout

2006-07-16 Thread Sami Siren (JIRA)
DmozParser does not output urls to stdout - Key: NUTCH-320 URL: http://issues.apache.org/jira/browse/NUTCH-320 Project: Nutch Issue Type: Bug Affects Versions: 0.8-dev Reporter: Sami Si

[jira] Resolved: (NUTCH-320) DmozParser does not output urls to stdout

2006-07-16 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-320?page=all ] Sami Siren resolved NUTCH-320. -- Resolution: Fixed > DmozParser does not output urls to stdout > - > > Key: NUTCH-320 > URL: h

[jira] Commented: (NUTCH-293) support for Crawl-delay in Robots.txt

2006-07-18 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-293?page=comments#action_12421930 ] Sami Siren commented on NUTCH-293: -- perhaps instead of delay = crawlDelay > 0 ? crawlDelay : serverDelay; we could do delay=Math.max(crawlDelay, serverDelay); als
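
The two expressions in the NUTCH-293 comment above do not behave identically, and a small sketch may make the difference easier to see. The class name, method names, and sample values below are hypothetical and are not taken from the Nutch source; only the two expressions themselves come from the comment.

```java
public class DelayChoice {

    // Expression quoted in the comment: use crawlDelay whenever it is set (> 0),
    // otherwise fall back to serverDelay.
    public static long ternaryDelay(long crawlDelay, long serverDelay) {
        return crawlDelay > 0 ? crawlDelay : serverDelay;
    }

    // Alternative suggested in the comment: always use the larger of the two values.
    public static long maxDelay(long crawlDelay, long serverDelay) {
        return Math.max(crawlDelay, serverDelay);
    }

    public static void main(String[] args) {
        // Hypothetical sample values in milliseconds: {crawlDelay, serverDelay}.
        long[][] cases = { {0, 5000}, {2000, 5000}, {10000, 5000} };
        for (long[] c : cases) {
            System.out.println("crawlDelay=" + c[0] + " serverDelay=" + c[1]
                    + " ternary=" + ternaryDelay(c[0], c[1])
                    + " max=" + maxDelay(c[0], c[1]));
        }
    }
}
```

The two forms only differ when crawlDelay is positive but smaller than serverDelay: the ternary form then uses the shorter crawlDelay, while Math.max keeps the longer serverDelay.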

[jira] Commented: (NUTCH-266) hadoop bug when doing updatedb

2006-07-23 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12422929 ] Sami Siren commented on NUTCH-266: -- I finally found the time to setup an environment with cygwin and try this out. I can confirm that the hadoop.jar version provid

[jira] Created: (NUTCH-327) bin/nutch setting of log path problems on cygwin

2006-07-23 Thread Sami Siren (JIRA)
bin/nutch setting of log path problems on cygwin Key: NUTCH-327 URL: http://issues.apache.org/jira/browse/NUTCH-327 Project: Nutch Issue Type: Bug Affects Versions: 0.8-dev Enviro

[jira] Resolved: (NUTCH-327) bin/nutch setting of log path problems on cygwin

2006-07-23 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-327?page=all ] Sami Siren resolved NUTCH-327. -- Resolution: Fixed > bin/nutch setting of log path problems on cygwin > > > Key: NUTCH-327 >

[jira] Created: (NUTCH-328) commons-cli-2.0-SNAPSHOT.jar provided with nutch is not compatible with jdk 1.4

2006-07-23 Thread Sami Siren (JIRA)
commons-cli-2.0-SNAPSHOT.jar provided with nutch is not compatible with jdk 1.4 --- Key: NUTCH-328 URL: http://issues.apache.org/jira/browse/NUTCH-328 Project: Nutch

[jira] Resolved: (NUTCH-328) commons-cli-2.0-SNAPSHOT.jar provided with nutch is not compatible with jdk 1.4

2006-07-23 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-328?page=all ] Sami Siren resolved NUTCH-328. -- Resolution: Fixed updated library > commons-cli-2.0-SNAPSHOT.jar provided with nutch is not compatible with jdk > 1.4 > ---

[jira] Updated: (NUTCH-249) black- white list url filtering

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-249?page=all ] Sami Siren updated NUTCH-249: - Fix Version/s: 0.9-dev (was: 0.8-dev) > black- white list url filtering > --- > > Key: NUTCH-249 >

[jira] Updated: (NUTCH-86) LanguageIdentifier API enhancements

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-86?page=all ] Sami Siren updated NUTCH-86: Fix Version/s: 0.9-dev (was: 0.8-dev) > LanguageIdentifier API enhancements > --- > > Key: NUTCH-86

[jira] Updated: (NUTCH-74) French Analyzer Plugin

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-74?page=all ] Sami Siren updated NUTCH-74: Fix Version/s: 0.9-dev (was: 0.8-dev) > French Analyzer Plugin > -- > > Key: NUTCH-74 > URL: ht

[jira] Updated: (NUTCH-246) segment size is never as big as topN or crawlDB size in a distributed deployement

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-246?page=all ] Sami Siren updated NUTCH-246: - Fix Version/s: 0.9-dev (was: 0.8-dev) > segment size is never as big as topN or crawlDB size in a distributed > deployement > -

[jira] Updated: (NUTCH-251) Administration GUI

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-251?page=all ] Sami Siren updated NUTCH-251: - Fix Version/s: 0.9-dev (was: 0.8-dev) > Administration GUI > -- > > Key: NUTCH-251 > URL: http:/

[jira] Updated: (NUTCH-318) log4j not proper configured, readdb doesnt give any information

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-318?page=all ] Sami Siren updated NUTCH-318: - Fix Version/s: 0.9-dev (was: 0.8-dev) > log4j not proper configured, readdb doesnt give any information > --

[jira] Updated: (NUTCH-322) Fetcher discards ProtocolStatus, doesn't store redirected pages

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-322?page=all ] Sami Siren updated NUTCH-322: - Fix Version/s: 0.9-dev (was: 0.8-dev) > Fetcher discards ProtocolStatus, doesn't store redirected pages > --

[jira] Updated: (NUTCH-262) Summary excerpts and highlights problems

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-262?page=all ] Sami Siren updated NUTCH-262: - Fix Version/s: 0.9-dev (was: 0.8-dev) > Summary excerpts and highlights problems > > >

[jira] Updated: (NUTCH-310) Review Log Levels

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-310?page=all ] Sami Siren updated NUTCH-310: - Fix Version/s: 0.9-dev (was: 0.8-dev) > Review Log Levels > - > > Key: NUTCH-310 > URL: http://i

[jira] Updated: (NUTCH-233) wrong regular expression hang reduce process for ever

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-233?page=all ] Sami Siren updated NUTCH-233: - Fix Version/s: 0.9-dev (was: 0.8-dev) > wrong regular expression hang reduce process for ever >

[jira] Updated: (NUTCH-247) robot parser to restrict.

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-247?page=all ] Sami Siren updated NUTCH-247: - Fix Version/s: 0.9-dev (was: 0.8-dev) > robot parser to restrict. > - > > Key: NUTCH-247 >

[jira] Updated: (NUTCH-325) UrlFilters.java throws NPE in case urlfilter.order contains Filters that are not in plugin.includes

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-325?page=all ] Sami Siren updated NUTCH-325: - Fix Version/s: 0.9-dev (was: 0.8-dev) > UrlFilters.java throws NPE in case urlfilter.order contains Filters that are > not in plugin.includes >

[jira] Commented: (NUTCH-318) log4j not proper configured, readdb doesnt give any information

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-318?page=comments#action_12423531 ] Sami Siren commented on NUTCH-318: -- Perhaps this is happening in a distributed setup? In a single-machine setup the output goes to the log file, see NUTCH-315 > log4j not proper

[jira] Commented: (NUTCH-318) log4j not proper configured, readdb doesnt give any information

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-318?page=comments#action_12423546 ] Sami Siren commented on NUTCH-318: -- I agree :) so the next thing to do is change readdb -stats to print to stdout, i'll go ahead and do that. Are there any other c

[jira] Resolved: (NUTCH-315) CrawlDbReader usage text - implementation mismatch

2006-07-25 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-315?page=all ] Sami Siren resolved NUTCH-315. -- Resolution: Duplicate duplicate of NUTCH-318 > CrawlDbReader usage text - implementation mismatch > -- > >

[jira] Commented: (NUTCH-318) log4j not proper configured, readdb doesnt give any information

2006-07-26 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-318?page=comments#action_12423557 ] Sami Siren commented on NUTCH-318: -- could this be solved by just adding the following line to conf/log4j.properties? log4j.logger.org.apache.nutch.crawl.CrawlDbReader

[jira] Commented: (NUTCH-318) log4j not proper configured, readdb doesnt give any information

2006-07-26 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-318?page=comments#action_12423579 ] Sami Siren commented on NUTCH-318: -- i just committed some changes to log4j configuration for some command line tools to trunk, is this satisfactory solution to thi

[jira] Updated: (NUTCH-332) doubling score causes by page internal anchors.

2006-07-28 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-332?page=all ] Sami Siren updated NUTCH-332: - Fix Version/s: 0.9 (was: 0.8) > doubling score causes by page internal anchors. > --- > >

[jira] Updated: (NUTCH-331) Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs

2006-07-28 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-331?page=all ] Sami Siren updated NUTCH-331: - Fix Version/s: (was: 0.8) > Fetcher incorrectly reports task progress to tasktracker resulting in skipped > URLs > ---

[jira] Updated: (NUTCH-309) Uses commons logging Code Guards

2006-07-28 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-309?page=all ] Sami Siren updated NUTCH-309: - Fix Version/s: 0.9 (was: 0.8) > Uses commons logging Code Guards > > > Key: NUTCH-309 >

[jira] Updated: (NUTCH-261) Multi Language Support

2006-07-28 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-261?page=all ] Sami Siren updated NUTCH-261: - Fix Version/s: 0.9 (was: 0.8) > Multi Language Support > -- > > Key: NUTCH-261 > URL: http:/

[jira] Updated: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2006-07-28 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-258?page=all ] Sami Siren updated NUTCH-258: - Fix Version/s: 0.9 (was: 0.8) > Once Nutch logs a SEVERE log item, Nutch fails forevermore > ---

[jira] Resolved: (NUTCH-318) log4j not proper configured, readdb doesnt give any information

2006-08-01 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-318?page=all ] Sami Siren resolved NUTCH-318. -- Fix Version/s: 0.8.1 Resolution: Fixed Assignee: Sami Siren marking this as resolved because it is now working ok in single node config. > log4j not

[jira] Commented: (NUTCH-266) hadoop bug when doing updatedb

2006-08-01 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12424930 ] Sami Siren commented on NUTCH-266: -- just adding a reminder: there are two options to get this fixed, use a patched version of hadoop-0.4.0 or wait until hadoop-0.5

[jira] Commented: (NUTCH-266) hadoop bug when doing updatedb

2006-08-04 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12425753 ] Sami Siren commented on NUTCH-266: -- I am planning to build a patched version of hadoop 0.4.0 that includes a fix for this problem. If there are no objections I wi

[jira] Created: (NUTCH-339) Refactor nutch to allow fetcher improvements

2006-08-04 Thread Sami Siren (JIRA)
Refactor nutch to allow fetcher improvements - Key: NUTCH-339 URL: http://issues.apache.org/jira/browse/NUTCH-339 Project: Nutch Issue Type: Task Components: fetcher Affects Versions

[jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements

2006-08-04 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-339?page=comments#action_12425782 ] Sami Siren commented on NUTCH-339: -- I am not sure what you are referring to by this 3-4 sec, but yes, I agree there are more aspects to optimize in fetcher, what I was f

[jira] Updated: (NUTCH-266) hadoop bug when doing updatedb

2006-08-04 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-266?page=all ] Sami Siren updated NUTCH-266: - Fix Version/s: 0.8.1 0.9.0 > hadoop bug when doing updatedb > -- > > Key: NUTCH-266 > UR

[jira] Updated: (NUTCH-339) Refactor nutch to allow fetcher improvements

2006-08-04 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-339?page=all ] Sami Siren updated NUTCH-339: - Fix Version/s: 0.9.0 Affects Version/s: 0.8 (was: 0.9.0) > Refactor nutch to allow fetcher improvements > --

[jira] Created: (NUTCH-340) Bug(s) in 0.8 tutorial

2006-08-04 Thread Sami Siren (JIRA)
Bug(s) in 0.8 tutorial -- Key: NUTCH-340 URL: http://issues.apache.org/jira/browse/NUTCH-340 Project: Nutch Issue Type: Bug Components: documentation Affects Versions: 0.8 Reporter: Sami Siren

[jira] Commented: (NUTCH-340) Bug(s) in 0.8 tutorial

2006-08-04 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-340?page=comments#action_12425820 ] Sami Siren commented on NUTCH-340: -- thanks for the effort, I however cannot apply your patch. Can you please check out http://wiki.apache.org/nutch/HowToContribute

[jira] Resolved: (NUTCH-340) Bug(s) in 0.8 tutorial

2006-08-05 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-340?page=all ] Sami Siren resolved NUTCH-340. -- Fix Version/s: (was: 0.8.1) Resolution: Fixed I just committed this to svn trunk and updated the website, thanks! > Bug(s) in 0.8 tutorial >

[jira] Closed: (NUTCH-334) I am using the search technique

2006-08-05 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-334?page=all ] Sami Siren closed NUTCH-334. Resolution: Invalid > I am using the search technique > --- > > Key: NUTCH-334 > URL: http://issues.apache.or

[jira] Resolved: (NUTCH-344) Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks

2006-08-08 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-344?page=all ] Sami Siren resolved NUTCH-344. -- Fix Version/s: 0.8.1 0.9.0 Resolution: Fixed I just committed this to 0.8 branch and trunk, thanks Greg! > Fetcher threads blocked on sync

[jira] Resolved: (NUTCH-266) hadoop bug when doing updatedb

2006-08-08 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-266?page=all ] Sami Siren resolved NUTCH-266. -- Resolution: Fixed I just updated hadoop versions, trunk contains 0.5.0, 0.8-branch contains patched 0.4.0 > hadoop bug when doing updatedb > --

[jira] Commented: (NUTCH-347) Build: plugins' Jars not found

2006-08-12 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-347?page=comments#action_12427729 ] Sami Siren commented on NUTCH-347: -- Those warnings are ok - there's not any harm happening. There are some plug-ins (lib-log4j for example) that don't generate any

[jira] Updated: (NUTCH-347) Build: plugins' Jars not found

2006-08-12 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-347?page=all ] Sami Siren updated NUTCH-347: - Attachment: nutch_build_plugins_patch.txt > Build: plugins' Jars not found > -- > > Key: NUTCH-347 > URL: h

[jira] Commented: (NUTCH-349) Port Nutch to use Hadoop Text instead of UTF8

2006-08-16 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-349?page=comments#action_12428399 ] Sami Siren commented on NUTCH-349: -- If anything at all should be done then I'd go for #2. There was also a total incompatibility from 0.7 to 0.8 and I didn't see so

[jira] Created: (NUTCH-351) Protocol forward proxy

2006-08-17 Thread Sami Siren (JIRA)
Protocol forward proxy -- Key: NUTCH-351 URL: http://issues.apache.org/jira/browse/NUTCH-351 Project: Nutch Issue Type: New Feature Components: fetcher Affects Versions: 0.8, 0.8.1, 0.9.0 Reporte

[jira] Updated: (NUTCH-351) Protocol forward proxy

2006-08-17 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-351?page=all ] Sami Siren updated NUTCH-351: - Attachment: protocol-http-proxy-adapter.txt > Protocol forward proxy > -- > > Key: NUTCH-351 > URL: http://issues.a

[jira] Commented: (NUTCH-341) IndexMerger now deletes entire after completing

2006-08-18 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-341?page=comments#action_12429029 ] Sami Siren commented on NUTCH-341: -- +1 for v2 > IndexMerger now deletes entire after completing > > >

[jira] Resolved: (NUTCH-347) Build: plugins' Jars not found

2006-08-18 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-347?page=all ] Sami Siren resolved NUTCH-347. -- Fix Version/s: 0.9.0 Resolution: Fixed Assignee: Sami Siren committed > Build: plugins' Jars not found > -- > >

[jira] Resolved: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml

2006-08-18 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-338?page=all ] Sami Siren resolved NUTCH-338. -- Resolution: Fixed This is now committed, thank you. The patch was broken, hopefully I got it right. > Remove the text parser as an option for parsing PDF files in

[jira] Commented: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml

2006-08-18 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-338?page=comments#action_12429044 ] Sami Siren commented on NUTCH-338: -- yeah, svn diff from commandline is the winner. > Remove the text parser as an option for parsing PDF files in parse-plugins.xml

[jira] Updated: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml

2006-08-18 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-338?page=all ] Sami Siren updated NUTCH-338: - Fix Version/s: 0.8.1 > Remove the text parser as an option for parsing PDF files in parse-plugins.xml > ---

[jira] Created: (NUTCH-360) Switch nutch to use java 5 source format

2006-09-01 Thread Sami Siren (JIRA)
Switch nutch to use java 5 source format Key: NUTCH-360 URL: http://issues.apache.org/jira/browse/NUTCH-360 Project: Nutch Issue Type: Task Affects Versions: 0.9.0 Reporter: Sami Siren

[jira] Resolved: (NUTCH-360) Switch nutch to use java 5 source format

2006-09-01 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-360?page=all ] Sami Siren resolved NUTCH-360. -- Resolution: Fixed done > Switch nutch to use java 5 source format > > > Key: NUTCH-360 > UR

[jira] Commented: (NUTCH-361) generator create fetchlist randomly

2006-09-02 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-361?page=comments#action_12432322 ] Sami Siren commented on NUTCH-361: -- I started to write (already put some on svn trunk) some simple junit tests for the main tools (inject, generate, fetch). if yo

[jira] Commented: (NUTCH-361) generator create fetchlist randomly

2006-09-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-361?page=comments#action_12432861 ] Sami Siren commented on NUTCH-361: -- nightly builds are broken because of this problem, I scratched my head for a long time because my local source was working perf

[jira] Commented: (NUTCH-361) generator create fetchlist randomly

2006-09-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-361?page=comments#action_12432864 ] Sami Siren commented on NUTCH-361: -- oops, pasted the wrong property: mapred.reduce.tasks 1 define mapred.reduce tasks to be number of slave hosts > g

[jira] Commented: (NUTCH-266) hadoop bug when doing updatedb

2006-09-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12432871 ] Sami Siren commented on NUTCH-266: -- what version of nutch are you running? > hadoop bug when doing updatedb > -- > > Ke

[jira] Commented: (NUTCH-361) generator create fetchlist randomly

2006-09-06 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-361?page=comments#action_12432877 ] Sami Siren commented on NUTCH-361: -- I have not tracked hadoop development that intensively so I really have no idea about all the changes from 0.4.x to 0.5.x More
