Hi Markus,

On Wed, Sep 10, 2014 at 10:28 PM, <[email protected]> wrote:

>
> Weird, I didn't see my own mail arrive on the list. I sent it via KMail
> but am on webmail now, which seems to work.


sigh ;)


> Anyway, for vertical search on a whole website I would rely on your
> (customized) Lucene similarity and proper analysis, and also on downgrading
> `bad` pages, for which you can write custom classifier plugins in Nutch.


Yep, this sounds much more appropriate for the task at hand. I have
debugged the Webgraph code as well as some of the tools within this
environment... it is not an apples-to-apples fit for what I am trying to
achieve.


> That way you can, for example, get rid of hub pages and promote actual
> content.
>
>
Yeah. I understand.


>
> Anyway, it all depends on what you want to achieve, which is....? :)
>


   - Networks. Specifically, domain-specific networks...
   - How they are formed and where they come from.
   - Where the traffic comes from (by server host, server IP, client IP and
   by content relevance).
   - What the graph looks like within these domain-specific networks. By
   the way, within this context, I think that a dense graph is probably OK. I
   am actually looking for this.
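
To make the "dense graph" point concrete, here is a rough sketch of how density could be measured at the host level, assuming the link data is available as (source URL, target URL) pairs. The function name and the sample links are made up for illustration; this is not the Nutch Webgraph code, just the usual directed-graph density E / (N * (N - 1)).

```python
from urllib.parse import urlparse


def host_graph_density(links):
    """Collapse page-level links to host-level edges and return directed density.

    Density = E / (N * (N - 1)), where N is the number of distinct hosts and
    E is the number of distinct host-to-host edges (self-links are ignored).
    """
    edges = set()
    hosts = set()
    for src, dst in links:
        s = urlparse(src).hostname
        d = urlparse(dst).hostname
        if s is None or d is None:
            continue  # skip URLs without a parseable host
        hosts.update((s, d))
        if s != d:
            edges.add((s, d))
    n = len(hosts)
    if n < 2:
        return 0.0  # density is undefined for fewer than two hosts; report 0
    return len(edges) / (n * (n - 1))


# Hypothetical sample data, not taken from a real crawl.
sample_links = [
    ("http://a.example/page1", "http://b.example/"),
    ("http://b.example/x", "http://a.example/page2"),
    ("http://a.example/page1", "http://c.example/"),
    ("http://a.example/page1", "http://a.example/page2"),
]
print(host_graph_density(sample_links))
```

A value close to 1 would mean nearly every host in the domain-specific network links to every other one, which is the kind of dense graph I would be happy to find.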
