From: Mario Alejandro M. [mailto:[EMAIL PROTECTED]
Sent: 23 January 2006 15:58
To: Otis Gospodnetic
Cc: [email protected]
Subject: Re: Indexing Urls pointing to same content
I know Lucene is not a web indexer... maybe I explained this badly.
I'm asking how to STORE the data, not how to locate it. If two files are the
same (comparing MD5 hashes is my current approach), then I plan to STORE the
content once, but it is necessary to add both locations.
Example:
c:\file1 Content: One
c:\file2
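
A minimal sketch of the "store once, keep every location" idea described above, in plain Java. It is not Lucene API: the class DedupStore and its methods are hypothetical names introduced only for illustration, and the content string in the usage note is made up. It assumes the content fits in memory and that an MD5 match means the content is the same; a byte-by-byte comparison could confirm a match where that matters.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Lucene): keeps each distinct content once,
// keyed by its MD5 digest, and records every location that carries it.
class DedupStore {
    private final Map<String, String> contentByHash = new HashMap<>();
    private final Map<String, List<String>> locationsByHash = new HashMap<>();

    // Returns the MD5 digest of the given bytes as lowercase hex.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on standard Java platforms.
            throw new IllegalStateException(e);
        }
    }

    // Stores the content only if its hash is new; always records the location.
    String add(String location, String content) {
        String hash = md5Hex(content.getBytes(StandardCharsets.UTF_8));
        contentByHash.putIfAbsent(hash, content);
        locationsByHash.computeIfAbsent(hash, k -> new ArrayList<>()).add(location);
        return hash;
    }

    // All locations recorded for a given content hash.
    List<String> locations(String hash) {
        List<String> found = locationsByHash.get(hash);
        return found == null ? Collections.<String>emptyList() : found;
    }

    // Number of distinct contents actually stored.
    int distinctContents() {
        return contentByHash.size();
    }
}
```

Adding http://localhost/sample.html and http://localhost/sample2.html with the same content would leave one stored content and two locations under the same hash. In an index, this might correspond to one document per hash carrying a location field once per URL, though that mapping is an assumption here.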
Fri 20 Jan 2006 05:27:01 PM EST
Subject: Indexing Urls pointing to same content
I found that the data I'm searching contains a lot of duplicated content.
The only difference is that the URL changes, i.e., one says
http://localhost/sample.html and the other http://localhost/sample2.html.
However, sample1 and sample2 are different files; that is, this does not
involve redirection or link