Thanks. Indeed, I am indexing each file. But how do I index each line of a file? Essentially this means: first I need to index each file to know whether the word exists or not, then I need to index each line of the file to know its location. This does not seem to be a problem.

The problem is this: if I specify a file name to index, that file is indexed. If I specify a directory name, all the files inside that directory are indexed. But how do I go about indexing each line of a file?

Does this mean reading each line of the file and feeding it to Lucene so that the index entries can be generated? That would be very resource intensive and would severely hurt performance.
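
For what it is worth, reading a file line by line does not require holding the whole file in memory. Below is a minimal sketch in plain Java, not Lucene code: it streams the lines and hands one record per line to a sink. The names LineDocs, LineDoc and indexLines are hypothetical, and the content field is an assumption; logfilename and lineno follow the field names in the quoted reply. The sink is the place where each record would be turned into a document and passed to the indexer.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class LineDocs {
    /** One record per log line: the fields one indexed document would carry (assumed layout). */
    public static final class LineDoc {
        public final String logfilename;
        public final int lineno;
        public final String content;

        public LineDoc(String logfilename, int lineno, String content) {
            this.logfilename = logfilename;
            this.lineno = lineno;
            this.content = content;
        }
    }

    /**
     * Streams the reader one line at a time and passes a LineDoc for each
     * line to the sink. Line numbers start at 1. Returns the line count.
     */
    public static int indexLines(String fileName, BufferedReader reader, Consumer<LineDoc> sink) {
        int lineno = 0;
        Iterator<String> lines = reader.lines().iterator();
        while (lines.hasNext()) {
            lineno++;
            sink.accept(new LineDoc(fileName, lineno, lines.next()));
        }
        return lineno;
    }

    public static void main(String[] args) {
        // Stand-in for a real log file; a real run would open the file with a BufferedReader.
        String sample = "service started\nERROR disk full\nservice stopped";
        List<LineDoc> docs = new ArrayList<>();
        int count = indexLines("logabc.txt", new BufferedReader(new StringReader(sample)), docs::add);
        System.out.println(count + " lines read");
        for (LineDoc d : docs) {
            System.out.println(d.logfilename + ":" + d.lineno + " " + d.content);
        }
    }
}
```

Only one line is held at a time, so memory use does not grow with file size. The actual cost of indexing that many small documents would still have to be measured.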

On 7/4/2013 2:04 PM, Ian Lea wrote:
Sounds like you're indexing each log file as one lucene document.
Obvious answer is to index each line in each log file as a separate
doc.  Searches would then match lines in files and you can display
those lines, summarizing counts per file if you want that.

If you wanted to be able to show surrounding lines, index the line
number and the file name.  So if you got a hit on line 12345 of file
logabc.txt you could execute a second search with logfilename:
logabc.txt AND lineno:[12340 TO 12350] to get 5 lines either side.
Use a NumericField and NumericRangeQuery for lineno if you are
concerned about performance.  See recent thread on this list for more
on that.
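
The window arithmetic for that second search can be sketched as follows. This is plain Java that only builds the query text shown above; the class and method names (ContextWindow, contextQuery) are hypothetical, and clamping the lower bound to line 1 is an added assumption. If lineno is indexed as a numeric field, the same two bounds would be passed to a numeric range query instead of being parsed from text.

```java
public class ContextWindow {
    /**
     * Builds the follow-up query text for the lines surrounding a hit.
     * The lower bound is clamped to 1 so hits near the top of a file
     * do not produce a zero or negative line number.
     */
    public static String contextQuery(String logFileName, int hitLine, int linesEitherSide) {
        int from = Math.max(1, hitLine - linesEitherSide);
        int to = hitLine + linesEitherSide;
        return "logfilename:" + logFileName + " AND lineno:[" + from + " TO " + to + "]";
    }

    public static void main(String[] args) {
        // prints: logfilename:logabc.txt AND lineno:[12340 TO 12350]
        System.out.println(contextQuery("logabc.txt", 12345, 5));
    }
}
```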


--
Ian.


On Thu, Jul 4, 2013 at 8:10 AM, Ankit Murarka
<ankit.mura...@rancoretech.com>  wrote:
Dear Team,
                  I have a potential use case. I have a large number of log
files which are archived in a particular directory. Now the administrator
would like to view certain information which might/might not be present in
any of the files inside the directory.

Using Lucene, I was able to find out whether the specific word he is searching
for is present in the files, and in which files it is present.

BUT, is it possible to find the location of that word inside the file? Each
file is about 5 MB, and it does not really make sense to parse the file to know
the location of a certain word which is present.

Can Lucene help in this regard? Or at least give a close approximation of its
location in the file? I would like to show at least 256KB of data from
the point where that word is present in the file.

Googled a lot but to no avail.

--
Regards

Ankit

"Peace is found not in what surrounds us, but in what we hold within."


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org




--
Regards

Ankit Murarka

"Peace is found not in what surrounds us, but in what we hold within."

