[jira] [Updated] (NUTCH-1570) Add filtering capability to Datastore Queries

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1570: Fix Version/s: (was: 2.4) 2.3 Add filtering capability to

[jira] [Updated] (NUTCH-1410) impact of a map-reduce problem

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1410: Fix Version/s: (was: 2.4) 2.3 impact of a map-reduce

[jira] [Closed] (NUTCH-1410) impact of a map-reduce problem

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-1410. --- impact of a map-reduce problem -- Key:

[jira] [Resolved] (NUTCH-1490) Data Truncation exceptions when using mysql

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1490. - Resolution: Won't Fix gora-sql not in use right now Data Truncation exceptions

[jira] [Closed] (NUTCH-1490) Data Truncation exceptions when using mysql

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-1490. --- Data Truncation exceptions when using mysql

[jira] [Resolved] (NUTCH-1497) Better default gora-sql-mapping.xml with larger field sizes for MySQL

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1497. - Resolution: Won't Fix gora-sql not in use right now Better default

[jira] [Updated] (NUTCH-1674) Use batchId filter to enable scan (GORA-119) for Fetch,Parse,Update,Index

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1674: Fix Version/s: (was: 2.4) 2.3 Use batchId filter to enable

[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1714: Summary: Nutch 2.x upgrade to Gora 0.4 (was: Nutch 2.x upgrade to use GORA_94

[jira] [Updated] (NUTCH-1301) Index job resume switch to resume a failed job

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1301: Fix Version/s: (was: 2.3) 2.4 Index job resume switch to

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Lewis John Mcgibbney
Hi Alparslan & Folks, OK, so you can see the roadmap here: http://s.apache.org/Xqk As you can see, in the 2.3 development drive we've addressed 66 of 71 issues. The remainder are as follows: NUTCH-1741 https://issues.apache.org/jira/browse/NUTCH-1741 Support of Sitemaps

[jira] [Commented] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986402#comment-13986402 ] Julien Nioche commented on NUTCH-1714: -- Hi [~lewismc] Re-progression update : I

[jira] [Commented] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4

2014-05-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986404#comment-13986404 ] Lewis John McGibbney commented on NUTCH-1714: - Looks like we have a couple of

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Julien Nioche
I'd exclude NUTCH-1741 for now and focus on the core updates (GORA, filters, etc.). See comments on NUTCH-1714 https://issues.apache.org/jira/browse/NUTCH-1714 On 1 May 2014 07:27, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi Alparslan & Folks, OK so you can see the road map's

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Talat Uyarer
I agree with you, Julien. Today Lewis changed the fix version of some issues from 2.3 to 2.4. Most of them are my issues :) May I ask: if I update these issues, can I change the fix version back to 2.3? I need them. Thanks, Talat 2014-05-01 9:47 GMT+03:00 Julien Nioche lists.digitalpeb...@gmail.com: I'd exclude

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Julien Nioche
Hi Talat, Not clear what you mean here. "I need them" is not really an explanation as to why they should be part of the next release. [If you want your own repository then open an account on GitHub (or somewhere else) and clone the 2.x branch to add the patches of your choice]. Lewis suggested a

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Talat Uyarer
Hi Julien, Sorry, you are right. I guess I did not express myself clearly. I meant that some of the issues assigned to 2.4 should be part of 2.3. The issues: NUTCH-1753 Eclipse dependecy problem for 2.x NUTCH-1748 urlfilter-validator to allow .. (two dots) inside file names (path

[jira] [Commented] (NUTCH-1753) Eclipse dependecy problem for 2.x

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986698#comment-13986698 ] Julien Nioche commented on NUTCH-1753: -- It won't do any harm to do it the way you are

[jira] [Resolved] (NUTCH-1740) BatchId parameter is not set in DbUpdaterJob

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche resolved NUTCH-1740. -- Resolution: Duplicate BatchId parameter is not set in DbUpdaterJob

[jira] [Updated] (NUTCH-1679) UpdateDb using batchId, link may override crawled page.

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1679: - Fix Version/s: (was: 2.4) 2.3 UpdateDb using batchId, link may override

[jira] [Updated] (NUTCH-1679) UpdateDb using batchId, link may override crawled page.

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1679: - Affects Version/s: (was: 2.3) 2.2.1 UpdateDb using batchId, link may

[jira] [Updated] (NUTCH-1728) indexer-solr plugin is not delete docs from solr

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1728: - Fix Version/s: 2.3 indexer-solr plugin is not delete docs from solr

[jira] [Commented] (NUTCH-1728) indexer-solr plugin is not delete docs from solr

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986713#comment-13986713 ] Julien Nioche commented on NUTCH-1728: -- +1 to commit indexer-solr plugin is not

[jira] [Updated] (NUTCH-1725) CleaningJob's reducer does not commit deleted docs.

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1725: - Fix Version/s: 2.3 CleaningJob's reducer does not commit deleted docs.

[jira] [Commented] (NUTCH-1725) CleaningJob's reducer does not commit deleted docs.

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986717#comment-13986717 ] Julien Nioche commented on NUTCH-1725: -- +1 to commit CleaningJob's reducer does not

[jira] [Updated] (NUTCH-1662) Indexer Plugin for Solr Cloud

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1662: - Affects Version/s: (was: 2.3) 2.2.1 Indexer Plugin for Solr Cloud

[jira] [Commented] (NUTCH-1662) Indexer Plugin for Solr Cloud

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986721#comment-13986721 ] Julien Nioche commented on NUTCH-1662: -- I think we did something pretty similar in

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Julien Nioche
Hi Talat, Comments below: NUTCH-1753 Eclipse dependecy problem for 2.x => trivial, please see my comments on it NUTCH-1748 urlfilter-validator to allow .. (two dots) inside file names (path elements) => still under discussion - leave it for 2.4 NUTCH-1740 BatchId parameter is not set

[jira] [Commented] (NUTCH-1657) ORIGINAL_CHAR_ENCODING and CHAR_ENCODING_FOR_CONVERSION never set in HTMLParser

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986726#comment-13986726 ] Julien Nioche commented on NUTCH-1657: -- +1 thanks! ORIGINAL_CHAR_ENCODING and

[jira] [Updated] (NUTCH-1618) Fetches some websites multiple times for long lasting queues

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1618: - Fix Version/s: (was: 2.4) 2.3 Fetches some websites multiple times for

[jira] [Updated] (NUTCH-1657) ORIGINAL_CHAR_ENCODING and CHAR_ENCODING_FOR_CONVERSION never set in HTMLParser

2014-05-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1657: - Fix Version/s: (was: 2.4) 2.3 ORIGINAL_CHAR_ENCODING and

[jira] [Commented] (NUTCH-1768) port NUTCH-1745 to Nutch 2.x (Upgrade to ElasticSearch 1.1.0)

2014-05-01 Thread Rogério Pereira Araújo (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986943#comment-13986943 ] Rogério Pereira Araújo commented on NUTCH-1768: --- Tried to apply this patch

[jira] [Updated] (NUTCH-1714) Nutch 2.x upgrade to Gora 0.4

2014-05-01 Thread Alparslan Avcı (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alparslan Avcı updated NUTCH-1714: -- Attachment: NUTCH-1714v5.patch Hi [~jnioche], I have uploaded a new patch that also fixes the