Hi Talat,

Moving this to dev@, as there is nothing in it that should not be public.

On Mon, Nov 3, 2014 at 9:36 AM, <[email protected]>
wrote:

>
> First of all I want to share my update status. I started to work on
> Nutch again for another Turkish Company as a consultant . :) I am very
> happy about that.
>

Glad to hear this, Talat. I apologize for the length of time it's taken me to
answer here.


>
> I want to ask some question to you. Since I started using Nutch, we do
> not make announcement about new release for 2.x
> 2.3 has some issues on Jira.


Yes, here: http://s.apache.org/WAF



> If we solve that, will 2.3 ready to become RC ?
>

Yes


>
> Secondly I wonder What do you think about creating a roadmap for 2.x ?
>

Frigging great idea.
We REALLY need to clean up the following
http://wiki.apache.org/nutch/Nutch2Roadmap


> If you agree this. We will create and share on Nutch wiki.


Please see above


> New
> contributors or users know deficiency of Nutch 2.x.


Yes, I am hoping that you guys are going to make the benefits of Nutch 1.X
and 2.X explicit at ApacheCon EU. I tried to do this in Denver as much as
possible this year as well, and we've had some good discussions on the
community lists which explain them.


> And We can arrange
> our new version for the roadmap.
>

Great


>
> From my mind:
>
> - Hadoop 2.x support (This depends to Gora)
>

It sure does. It also depends on us trying and testing it out and making
sure that as much as possible is done to make the transition smooth. I am +1.


> - Giraph support. (Emre Aladag developed a solution. But it is very
> awkward. In AGMLAB we used that.)
>

Huge +1 for this. I tried to work with Emre but unfortunately he got his
GSoC proposal in too late. Never mind though as the past is behind us. Emre
did however post some patches. We need to get those patches and take
advantage of them where we can. If we could re-engage Emre then all the
better as he has a good understanding of the work and I was disappointed to
not have been able to work with him.


> - Sitemap Support (We try to solve this In AGMLab with Alparslan Avci)
>

Not only did Alparslan submit a patch, AFAIK he also provided us with
several UML-style graphics for communicating his ideas for this work
within the context of the Nutch 2.X architecture. I am +1.


> - HTML5 support
>

This needs to be more specific. I am not sure what you mean. IMHO this is
work which needs to happen mostly up in Tika.


> - RDF Microformats Support
>

This should happen in Any23. Microformats and a whole host of other things.
I would like to begin work on the Any23 plugin, which would write to a
triple store as suggested by Seb, as opposed to adding triples to a
Metadata field, which is how I initially wrote the plugin.


> - Static Snippet Generation
>

More detail required.


> - Sentences Detection and Named Entity Recognize
>

This could be the result of implementing a number of configurable plugins.
The scope right now is up to whoever wants to begin work on this.


>
> Do not hesitate adding anything above. I wonder your comments.
>
>
What I would ask is that you email me your Wiki name and please add the
above to the wiki page I referenced at the beginning of my response. We can
take the 2.4 release from there and hopefully release with some of these
features included.
Thanks, Talat, for your vision; it is very positive.
Lewis
