[jira] Commented: (NUTCH-811) Develop an ORM framework

2010-05-07 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865226#action_12865226 ] Enis Soztutar commented on NUTCH-811: - Hi Piet, The code for Gora will reside in GitHub

[jira] Commented: (NUTCH-811) Develop an ORM framework

2010-04-26 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861010#action_12861010 ] Enis Soztutar commented on NUTCH-811: - I have further developed the code, which was once

[jira] Closed: (NUTCH-808) Evaluate ORM Frameworks which support non-relational column-oriented datastores and RDBMs

2010-04-26 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar closed NUTCH-808. --- Resolution: Fixed We have decided to go on with implementing an ORM layer as per the discussion on NU

[jira] Commented: (NUTCH-811) Develop an ORM framework

2010-04-13 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856546#action_12856546 ] Enis Soztutar commented on NUTCH-811: - Actually, we plan to develop the code for this la

[jira] Created: (NUTCH-811) Develop an ORM framework

2010-04-13 Thread Enis Soztutar (JIRA)
Develop an ORM framework - Key: NUTCH-811 URL: https://issues.apache.org/jira/browse/NUTCH-811 Project: Nutch Issue Type: New Feature Reporter: Enis Soztutar Assignee: Enis Soztutar

[jira] Commented: (NUTCH-808) Evaluate ORM Frameworks which support non-relational column-oriented datastores and RDBMs

2010-04-13 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856360#action_12856360 ] Enis Soztutar commented on NUTCH-808: - bq. What do you mean by current implementation? N

[jira] Commented: (NUTCH-808) Evaluate ORM Frameworks which support non-relational column-oriented datastores and RDBMs

2010-04-12 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856124#action_12856124 ] Enis Soztutar commented on NUTCH-808: - So, these are the results so far: DataNucleus wa

[jira] Commented: (NUTCH-808) Evaluate ORM Frameworks which support non-relational column-oriented datastores and RDBMs

2010-04-02 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852840#action_12852840 ] Enis Soztutar commented on NUTCH-808: - A candidate framework is DataNucleus. It has the

[jira] Created: (NUTCH-808) Evaluate ORM Frameworks which support non-relational column-oriented datastores and RDBMs

2010-04-02 Thread Enis Soztutar (JIRA)
Evaluate ORM Frameworks which support non-relational column-oriented datastores and RDBMs -- Key: NUTCH-808 URL: https://issues.apache.org/jira/browse/NUTCH-808

[jira] Commented: (NUTCH-442) Integrate Solr/Nutch

2008-10-07 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637489#action_12637489 ] Enis Soztutar commented on NUTCH-442: - I personally believe this patch should be in befo

[jira] Resolved: (NUTCH-588) Help Need

2007-12-07 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar resolved NUTCH-588. - Resolution: Invalid Jira is not for asking questions. You should ask your questions on nutch-user

[jira] Updated: (NUTCH-586) Add option to run compiled classes w/o job file

2007-12-04 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-586: Attachment: run-core_v2.patch bq. I think you also need to put a comment, which clarifies that this

[jira] Commented: (NUTCH-586) Add option to run compiled classes w/o job file

2007-12-04 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548198 ] Enis Soztutar commented on NUTCH-586: - Can someone review this ? > Add option to run compiled classes w/o job fil

[jira] Updated: (NUTCH-586) Add option to run compiled classes w/o job file

2007-11-30 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-586: Attachment: run-core_v1.patch Attached file adds -core option to bin/nutch. > Add option to run co

[jira] Created: (NUTCH-586) Add option to run compiled classes w/o job file

2007-11-30 Thread Enis Soztutar (JIRA)
Add option to run compiled classes w/o job file --- Key: NUTCH-586 URL: https://issues.apache.org/jira/browse/NUTCH-586 Project: Nutch Issue Type: New Feature Affects Versions: 1.0.0

[jira] Created: (NUTCH-583) FeedParser empty links for items

2007-11-27 Thread Enis Soztutar (JIRA)
FeedParser empty links for items Key: NUTCH-583 URL: https://issues.apache.org/jira/browse/NUTCH-583 Project: Nutch Issue Type: Bug Affects Versions: 1.0.0 Reporter: Enis Soztutar

[jira] Commented: (NUTCH-573) Multiple Domains - Query Search

2007-11-16 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543067 ] Enis Soztutar commented on NUTCH-573: - So, how shall we proceed with this one? I give +1 to commit this, and deal

[jira] Commented: (NUTCH-573) Multiple Domains - Query Search

2007-11-14 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542449 ] Enis Soztutar commented on NUTCH-573: - @Andrzej I recall google over comma delimited syntax, but now it doesn't wo

[jira] Commented: (NUTCH-573) Multiple Domains - Query Search

2007-11-14 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542389 ] Enis Soztutar commented on NUTCH-573: - bq. Using commas is IMHO not intuitive With all due respect, I must disagree

[jira] Updated: (NUTCH-573) Multiple Domains - Query Search

2007-11-14 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-573: Fix Version/s: 1.0.0 Priority: Major (was: Minor) Affects Version/s: (was:

[jira] Updated: (NUTCH-573) Multiple Domains - Query Search

2007-11-13 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-573: Attachment: multiTermQuery_v1.patch Here is a patch that enables querying multiple values for the sa

[jira] Assigned: (NUTCH-573) Multiple Domains - Query Search

2007-11-13 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar reassigned NUTCH-573: --- Assignee: Enis Soztutar > Multiple Domains - Query Search > --- >

[jira] Commented: (NUTCH-574) Including inlink anchor text in index can create irrelevant search results.

2007-11-11 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541631 ] Enis Soztutar commented on NUTCH-574: - bq. Is this the type of process you were talking about with selecting most

[jira] Commented: (NUTCH-574) Including inlink anchor text in index can create irrelevant search results.

2007-11-09 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541359 ] Enis Soztutar commented on NUTCH-574: - Honestly, I don't think not indexing anchor words that do not appear in the

[jira] Commented: (NUTCH-574) Including inlink anchor text in index can create irrelevant search results.

2007-11-09 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541326 ] Enis Soztutar commented on NUTCH-574: - Why don't you just refactor indexing anchor code into another plugin, say

[jira] Commented: (NUTCH-442) Integrate Solr/Nutch

2007-10-26 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537954 ] Enis Soztutar commented on NUTCH-442: - Due to the method signature bug (http://bugs.sun.com/bugdatabase/view_bug.

[jira] Commented: (NUTCH-442) Integrate Solr/Nutch

2007-10-15 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534869 ] Enis Soztutar commented on NUTCH-442: - Using Nutch with Solr has been a much-requested feature, so it will be very

[jira] Commented: (NUTCH-558) Need tool to retrieve domain statistics

2007-09-27 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530656 ] Enis Soztutar commented on NUTCH-558: - I wonder why you do not use URLUtils introduced in NUTCH-439. Also there is

[jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-08-20 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521033 ] Enis Soztutar commented on NUTCH-439: - Recently Matt Cutts has written about the parts of URLs: http://www.matt

[jira] Created: (NUTCH-541) Index url field untokenized

2007-08-09 Thread Enis Soztutar (JIRA)
Index url field untokenized --- Key: NUTCH-541 URL: https://issues.apache.org/jira/browse/NUTCH-541 Project: Nutch Issue Type: New Feature Components: indexer, searcher Affects Versions: 1.0.0

[jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-07-27 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515987 ] Enis Soztutar commented on NUTCH-439: - By the way, Andrzej could you please enable support for wiki style editing

[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-07-27 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-439: Attachment: tld_plugin_v2.3.patch bq. TLDScoringFilter contains a misspelled field, tldEnties, it sh

[jira] Commented: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining

2007-07-18 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513826 ] Enis Soztutar commented on NUTCH-518: - > I think removing initial score arguments and merging scores in > Scoring

[jira] Commented: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining

2007-07-18 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513819 ] Enis Soztutar commented on NUTCH-518: - Since there is no ordering among scoring filters, if we do something specif

[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-07-18 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-439: Attachment: tld_plugin_v2.2.patch This patch includes "core" domain utilities and the tld plugin, bu

[jira] Commented: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-07-18 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513482 ] Enis Soztutar commented on NUTCH-439: - As for Doğacan's comments I've opened issues NUTCH-518 and NUTCH-517. > T

[jira] Updated: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining

2007-07-18 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-518: Attachment: opicScoring.chain.patch Patch is attached, which was formerly a part of the patch in NUT

[jira] Created: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining

2007-07-18 Thread Enis Soztutar (JIRA)
Fix OpicScoringFilter to respect scoring filter chaining Key: NUTCH-518 URL: https://issues.apache.org/jira/browse/NUTCH-518 Project: Nutch Issue Type: Bug Components: indexe

[jira] Updated: (NUTCH-517) build encoding should be UTF-8

2007-07-18 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-517: Attachment: build.encoding.patch Patch for UTF-8 is attached > build encoding should be UTF-8 > ---

[jira] Created: (NUTCH-517) build encoding should be UTF-8

2007-07-18 Thread Enis Soztutar (JIRA)
build encoding should be UTF-8 -- Key: NUTCH-517 URL: https://issues.apache.org/jira/browse/NUTCH-517 Project: Nutch Issue Type: Bug Affects Versions: 1.0.0 Reporter: Enis Soztutar Fix

[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-07-10 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-439: Attachment: tld_plugin_v2.1.patch Oops, it seems that I've uploaded the wrong file. This is the corr

[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-07-10 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-439: Attachment: (was: domain.suffixes_v2.1.patch) > Top Level Domains Indexing / Scoring > -

[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-07-10 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-439: Attachment: domain.suffixes_v2.1.patch > Very nice patch! Thanks ! > IP_PATTERN - it could be tight

[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-07-10 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-439: Attachment: tld_plugin_v2.0.patch I have made major improvements to the code and configuration files

[jira] Commented: (NUTCH-508) ${hadoop.log.dir} and ${hadoop.log.file} are not propagated to the tasktracker

2007-07-09 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511121 ] Enis Soztutar commented on NUTCH-508: - Tasktracker invokes another jvm calling TaskTracker$Child but hadoop.log.di

[jira] Issue Comment Edited: (NUTCH-510) IndexMerger delete working dir

2007-07-09 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511043 ] Enis Soztutar edited comment on NUTCH-510 at 7/9/07 5:32 AM: - Attached patch deletes worki

[jira] Updated: (NUTCH-510) IndexMerger delete working dir

2007-07-08 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-510: Attachment: index.merger.delete.temp.dirs.patch Attached patch deletes working dirs on finally claus

[jira] Created: (NUTCH-510) IndexMerger delete working dir

2007-07-08 Thread Enis Soztutar (JIRA)
IndexMerger delete working dir -- Key: NUTCH-510 URL: https://issues.apache.org/jira/browse/NUTCH-510 Project: Nutch Issue Type: Improvement Components: indexer Affects Versions: 1.0.0 Re

[jira] Commented: (NUTCH-501) implementing a different caching mechanism for objects

2007-06-18 Thread Enis Soztutar (JIRA)
Implement a different caching mechanism for objects cached in configuration [ https://issues.apache.org/jira/browse/NUTCH-501?page=com.atlassian.jira.plu

[jira] Commented: (NUTCH-498) Use Combiner in LinkDb to increase speed of linkdb generation

2007-06-15 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505079 ] Enis Soztutar commented on NUTCH-498: - I think you may not want {code} reporter.incrCounter(Counters.COMBINED, c

[jira] Updated: (NUTCH-471) Fix synchronization in NutchBean creation

2007-04-27 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-471: Attachment: NutchBeanCreationSync_v2.patch From http://www-128.ibm.com/developerworks/java/library/

[jira] Commented: (NUTCH-475) Adaptive crawl delay

2007-04-25 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491882 ] Enis Soztutar commented on NUTCH-475: - We can use a formula like: delay = alpha * delay + (1 - alpha) * (k * t)
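The formula quoted in this comment has the shape of an exponentially weighted moving average. A minimal Java sketch of one update step follows. The snippet is truncated and does not define its symbols, so the reading here is an assumption: t is taken to be the most recently observed response time of the host, k a multiplier that turns it into a target delay, and alpha a smoothing weight in [0, 1]. The class and method names (AdaptiveDelay, next) are hypothetical and are not taken from Nutch or from any patch on the issue.

```java
// Hypothetical sketch of: delay = alpha * delay + (1 - alpha) * (k * t)
// Assumed meanings (not confirmed by the truncated comment):
//   delay - current crawl delay for a host
//   t     - last observed response time for that host
//   k     - multiplier mapping response time to a target delay
//   alpha - smoothing weight in [0, 1]; higher keeps more of the old delay
final class AdaptiveDelay {
    private AdaptiveDelay() {}

    /** Applies one smoothing step and returns the updated delay. */
    static double next(double delay, double t, double alpha, double k) {
        if (alpha < 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in [0, 1]");
        }
        return alpha * delay + (1.0 - alpha) * (k * t);
    }

    public static void main(String[] args) {
        double delay = 5.0;                  // illustrative starting delay
        double[] observed = {1.0, 1.0, 0.2}; // illustrative response times
        for (double t : observed) {
            delay = next(delay, t, 0.5, 10.0);
            System.out.println(delay);
        }
    }
}
```

Under this reading, alpha close to 1 makes the delay react slowly to a change in response time, and alpha close to 0 makes it track k * t almost immediately.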

[jira] Commented: (NUTCH-471) Fix synchronization in NutchBean creation

2007-04-24 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491313 ] Enis Soztutar commented on NUTCH-471: - > Nice trick with the unsynchronized check. :) Wow, indeed I have used a pa

[jira] Updated: (NUTCH-471) Fix synchronization in NutchBean creation

2007-04-24 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-471: Attachment: NutchBeanCreationSync_v1.patch this patch synchronizes NutchBean.get((ServletContext app

[jira] Created: (NUTCH-471) Fix synchronization in NutchBean creation

2007-04-24 Thread Enis Soztutar (JIRA)
Fix synchronization in NutchBean creation - Key: NUTCH-471 URL: https://issues.apache.org/jira/browse/NUTCH-471 Project: Nutch Issue Type: Bug Components: searcher Affects Versions: 1.0.0

[jira] Commented: (NUTCH-466) Flexible segment format

2007-04-02 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485996 ] Enis Soztutar commented on NUTCH-466: - >> There may be many parts that use the same key/value classes in MapFiles.

[jira] Commented: (NUTCH-466) Flexible segment format

2007-04-02 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485977 ] Enis Soztutar commented on NUTCH-466: - This patch will indeed resolve many issues related to storing extra informa

[jira] Commented: (NUTCH-464) Commandline Search

2007-03-27 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484447 ] Enis Soztutar commented on NUTCH-464: - Opening an issue to ask for help is not a good practice. You should instead

[jira] Commented: (NUTCH-455) dedup on tokenized fields is faulty

2007-03-08 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12479262 ] Enis Soztutar commented on NUTCH-455: - (from LUCENE-252) In nutch we have 3 options : 1st is to disallow deleting

[jira] Updated: (NUTCH-455) dedup on tokenized fields is faulty

2007-03-07 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-455: Attachment: IndexSearcherCacheWarm.patch the patch to the IndexSearcher is attached > dedup on toke

[jira] Created: (NUTCH-455) dedup on tokenized fields is faulty

2007-03-07 Thread Enis Soztutar (JIRA)
dedup on tokenized fields is faulty --- Key: NUTCH-455 URL: https://issues.apache.org/jira/browse/NUTCH-455 Project: Nutch Issue Type: Bug Components: searcher Affects Versions: 0.9.0

[jira] Updated: (NUTCH-445) Domain İndexing / Query Filter

2007-02-28 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-445: Attachment: index_query_domain_v1.2.patch This patch is an update of the previous three patches. Th

[jira] Commented: (NUTCH-445) Domain İndexing / Query Filter

2007-02-28 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476550 ] Enis Soztutar commented on NUTCH-445: - Well, indeed the current two patches TranslatingRawFieldQueryFilter_v1.0.p

[jira] Updated: (NUTCH-445) Domain İndexing / Query Filter

2007-02-15 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-445: Attachment: index_query_domain_v1.1.patch This patch fixes the raw field name bug in v1.0 and adds t

[jira] Updated: (NUTCH-445) Domain İndexing / Query Filter

2007-02-15 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-445: Attachment: TranslatingRawFieldQueryFilter_v1.0.patch This patch complements index_query_domain_v1.0

[jira] Updated: (NUTCH-445) Domain İndexing / Query Filter

2007-02-15 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-445: Attachment: index_query_domain_v1.0.patch Patch for index-domain and query-domain plugins. > Domai

[jira] Created: (NUTCH-445) Domain İndexing / Query Filter

2007-02-15 Thread Enis Soztutar (JIRA)
Domain İndexing / Query Filter -- Key: NUTCH-445 URL: https://issues.apache.org/jira/browse/NUTCH-445 Project: Nutch Issue Type: New Feature Components: indexer, searcher Affects Versions: 0.9.0

[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-02-07 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-439: Attachment: tld_plugin_v1.1.patch I accidentally forgot to unset http.agent.name in v1.0

[jira] Updated: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-02-06 Thread Enis Soztutar (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated NUTCH-439: Attachment: tld_plugin_v1.0.patch This is a plugin implementation for indexing and scoring top level

[jira] Created: (NUTCH-439) Top Level Domains Indexing / Scoring

2007-02-06 Thread Enis Soztutar (JIRA)
Top Level Domains Indexing / Scoring Key: NUTCH-439 URL: https://issues.apache.org/jira/browse/NUTCH-439 Project: Nutch Issue Type: New Feature Components: indexer Affects Versions: 0.9.0

[jira] Updated: (NUTCH-251) Administration GUI

2006-11-23 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-251?page=all ] Enis Soztutar updated NUTCH-251: Attachment: Nutch-251-AdminGUI.tar.gz I have updated the patch written by Stephan. This version works with Nutch-0.9-dev and hadoop-0.7.1 (current version of nu

[jira] Updated: (NUTCH-289) CrawlDatum should store IP address

2006-11-16 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-289?page=all ] Enis Soztutar updated NUTCH-289: Attachment: ipInCrawlDatumDraftV5.1.patch The version 5 patch does not run on the current build. So I have fixed it and resent the patch (did not change any cod

[jira] Commented: (NUTCH-393) Indexer doesn't handle null documents returned by filters

2006-11-07 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-393?page=comments#action_12447787 ] Enis Soztutar commented on NUTCH-393: - Also, IndexingException is caught by the Indexer, in which case the whole document is not added to the writer (the funct

[jira] Updated: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host

2006-11-07 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-389?page=all ] Enis Soztutar updated NUTCH-389: Attachment: urlTokenizer-improved.diff This is an improvement and a minor bug fix over the previous url tokenizer. This version first replaces characters, which

[jira] Commented: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host

2006-10-30 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-389?page=comments#action_12445512 ] Enis Soztutar commented on NUTCH-389: - Otis, you can test the tokenizer using the TestUrlTokenizer JUnit test case. And you can test the NutchDocumentTokenizer b

[jira] Updated: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host

2006-10-20 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-389?page=all ] Enis Soztutar updated NUTCH-389: Description: NutchAnalysis.jj tokenizes the input by treating & and _ as non-token separators, which is not appropriate in the case of URLs. So I have writ

[jira] Updated: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host

2006-10-20 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-389?page=all ] Enis Soztutar updated NUTCH-389: Attachment: urlTokenizer.diff patch for url tokenization > a url tokenizer implementation for tokenizing index fields : url and host > -

[jira] Created: (NUTCH-389) a url tokenizer implementation for tokenizing index fields : url and host

2006-10-20 Thread Enis Soztutar (JIRA)
a url tokenizer implementation for tokenizing index fields : url and host -- Key: NUTCH-389 URL: http://issues.apache.org/jira/browse/NUTCH-389 Project: Nutch Issue Typ

[jira] Commented: (NUTCH-356) Plugin repository cache can lead to memory leak

2006-08-30 Thread Enis Soztutar (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-356?page=comments#action_12431548 ] Enis Soztutar commented on NUTCH-356: - I observed strange behaviour, when one of the plug-ins could not be included. For example the plugin system fails to load