Thank you, Julien. I agree that we should make our releases as robust as
possible. I will work on your comments.

Talat


2014-05-01 19:32 GMT+03:00 Julien Nioche <[email protected]>:

> Hi Talat,
>
> Comments below :
>
>  NUTCH-1753 Eclipse dependency problem for 2.x
>>
>
> => trivial, please see my comments on it
>
>
>> NUTCH-1748 urlfilter-validator to allow .. (two dots) inside file names
>> (path elements)
>>
>
> => still under discussion - leave it for 2.4
>
>
>> NUTCH-1740 BatchId parameter is not set in DbUpdaterJob
>>
>
> => duplicate
>
>
>> NUTCH-1728 indexer-solr plugin does not delete docs from Solr
>>
>
> => trivial enough to be committed for 2.3
>
>
>> NUTCH-1725 CleaningJob's reducer does not commit deleted docs.
>>
>
> => trivial enough to be committed for 2.3
>
>
>> NUTCH-1662 NUTCH-1568 Indexer Plugin for Solr Cloud
>>
>
> => I think we did something pretty similar in 1.x and would like to make
> sure that both versions are as similar as possible.
>
>
>>  NUTCH-1661 Language based crawling
>>
>
> => This is definitely not being committed. You haven't replied to Otis's
> questions and this has to be properly reviewed first and discussed.
>
>
>> NUTCH-1660 Index filter for Page's latitude and longitude
>>
>
> => same. You haven't replied to the comments on this one.
>
>
>> NUTCH-1657 ORIGINAL_CHAR_ENCODING and CHAR_ENCODING_FOR_CONVERSION never
>> set in HTMLParser
>>
>
> => trivial indeed, +1 thanks
>
>
>> NUTCH-1643 Unnecessary fetching with http.content.limit when using
>> protocol-http
>>
>
> => needs reviewing first, let's leave it for later
>
>
>> NUTCH-1618 Fetches some websites multiple times for long lasting queues
>>
>
> => trivial indeed, please change the title to something more explicit like
> "Turn speculative execution off for Fetching"
>
> I have added NUTCH-1679 <https://issues.apache.org/jira/browse/NUTCH-1679>
>  (UpdateDb using batchId, link may override crawled page.) to 2.3 as it
> must be fixed ASAP.
>
> Thanks for pointing out these issues. I think the focus for 2.3 should be
> to get everything as robust as possible; we can always add new
> functionality in another release after that ("release often" etc.). One
> thing we should definitely have though is to leverage the brand new GORA
> filtering so that we get only the entries marked for a given job - see
> discussion on NUTCH-1714. This should make Nutch 2.x a lot faster.
>
> We haven't released 2.x for some time and loads of interesting stuff has
> been done to it. It will be an exciting release!
>
> Thanks for your contributions and pushing things forward!
>
> Julien
>
>
>
>>
>> 2014-05-01 11:32 GMT+03:00 Julien Nioche <[email protected]>:
>>
>> Hi Talat
>>>
>>> Not clear what you mean here. "I need them" is not really an explanation
>>> as to why they should be part of the next release. [If you want your own
>>> repository then open an account on GitHub (or somewhere else) and clone the
>>> 2.x branch to add the patches of your choice].
>>>
>>> Lewis suggested a roadmap for the next release and the changes he made
>>> reflect his suggestions. If you think some of the issues should be part of
>>> the 2.3 release then please explain why. BTW I don't think you agree with
>>> me as I was suggesting we stick to the ones already listed minus 1741.
>>>
>>> Thanks
>>>
>>> Julien
>>>
>>>
>>>
>>> On 1 May 2014 08:40, Talat Uyarer <[email protected]> wrote:
>>>
>>>> I agree with you, Julien. Today Lewis changed the fix version of some
>>>> issues from 2.3 to 2.4, most of them mine :) May I ask: if I update these
>>>> issues, can I change the fix version back to 2.3? I need them.
>>>>
>>>> Thanks
>>>> Talat
>>>>
>>>>
>>>> 2014-05-01 9:47 GMT+03:00 Julien Nioche <[email protected]>:
>>>>
>>>> I'd exclude NUTCH-1741 for now and focus on the core updates (GORA,
>>>>> filters, etc...). See comments on
>>>>> NUTCH-1714 <https://issues.apache.org/jira/browse/NUTCH-1714>
>>>>>
>>>>>
>>>>> On 1 May 2014 07:27, Lewis John Mcgibbney
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Hi Alparslan & Folks,
>>>>>>
>>>>>> OK, so you can see the roadmap here
>>>>>>
>>>>>> <http://s.apache.org/Xqk>
>>>>>>
>>>>>> As you can see, in the 2.3 development drive we've addressed 66 of 71
>>>>>> issues. The remaining ones are as follows:
>>>>>>
>>>>>> NUTCH-1741 <https://issues.apache.org/jira/browse/NUTCH-1741>
>>>>>> Support of Sitemaps in Nutch 2.x
>>>>>>
>>>>>> NUTCH-1714 <https://issues.apache.org/jira/browse/NUTCH-1714>
>>>>>> Nutch 2.x upgrade to Gora 0.4
>>>>>>
>>>>>> NUTCH-1709 <https://issues.apache.org/jira/browse/NUTCH-1709>
>>>>>> Generated classes o.a.n.storage.Host and o.a.n.storage.ProtocolStatus
>>>>>> contain methods not defined in source .avsc
>>>>>>
>>>>>> NUTCH-1674 <https://issues.apache.org/jira/browse/NUTCH-1674>
>>>>>> Use batchId filter to enable scan (GORA-119) for
>>>>>> Fetch, Parse, Update, Index
>>>>>>
>>>>>> NUTCH-1570 <https://issues.apache.org/jira/browse/NUTCH-1570>
>>>>>> Add filtering capability to Datastore Queries
>>>>>>
>>>>>> I think if we addressed the above then we could push an RC.
>>>>>> Any comments?
>>>>>> I'll be able to crack on with this final push relatively soon.
>>>>>>
>>>>>> On Tue, Apr 29, 2014 at 1:09 PM, <[email protected]> wrote:
>>>>>>
>>>>>>>
>>>>>>> I think we can also add
>>>>>>> https://issues.apache.org/jira/browse/NUTCH-1674. This issue was
>>>>>>> waiting for the stable release of gora-0.4.
>>>>>>>
>>>>>>> And IMHO, we can add
>>>>>>> https://issues.apache.org/jira/browse/NUTCH-1741, if anyone could
>>>>>>> review and test it.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Alparslan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Open Source Solutions for Text Engineering
>>>>>
>>>>> http://digitalpebble.blogspot.com/
>>>>> http://www.digitalpebble.com
>>>>> http://twitter.com/digitalpebble
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Talat UYARER
>>>> Websitesi: http://talat.uyarer.com
>>>> Twitter: http://twitter.com/talatuyarer
>>>> Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>



-- 
Talat UYARER
Websitesi: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
