[ANNOUNCE] Apache Gora 0.8 Release

2017-09-20 Thread lewis john mcgibbney
Hi Folks, The Apache Gora team are pleased to announce the immediate availability of Apache Gora 0.8. The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to
- column stores,
- key value stores,
- document stores,

RE: [EXT] Re: Nutch Plugin Lifecycle broken due to lazy loading?

2017-09-20 Thread Yossi Tamari
Hi Hiran, I recently needed the documents you requested myself, and the two below were the most helpful. Keep in mind that like most Nutch documentation, they are not totally up to date, so you need to be a bit flexible. The most important difference for me was getting the source from GitHub

RE: [EXT] Re: Nutch Plugin Lifecycle broken due to lazy loading?

2017-09-20 Thread Hiran CHAUDHURI
>> When you look at the protocol-smb hook it comes with this static hook,
>> but as it is never executed does not help.
>
> Yes, it has to be called.
So when would Nutch call this static hook? In practice this does not happen before the plugin is required, but then it is too late as the

Re: depth scoring filter

2017-09-20 Thread Jigal van Hemert | alterNET internet BV
Hi, On 20 September 2017 at 06:36, Michael Coffey wrote:
> I am trying to develop a news crawler and I want to prohibit it from
> wandering too far away from the seed list that I provide.
> It seems like I should use the DepthScoringFilter, but I am having trouble
>