Hi,

I want to index log files, say *type_a.log* and *type_b.log*, and these
files come from various hosts. Normally I would index each row as a flat
document like <some_field: "", from_host: "", log_type: "">, i.e. index
everything into a single flat index.

The issue is this: say each file has 1 billion rows. With 2 hosts, each with
2 files, the index will have 4 billion rows. So when someone wants all the
details for a particular host, the search space will still be 4 billion rows.
As Lucene is flat, it indexes this way only. Is there a way I can use
metadata such as the log type and hostname to create separate indexes, so
that search becomes faster when we need details from a particular host or
log file only? Or is this a bad idea, because a join query (e.g. give data
from host A and B and show it in descending order) will be very slow?
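
To make the idea concrete, here is a rough sketch in plain Python. It is not
Lucene code: the class, method, and field names are made up for illustration,
and each "index" is just an in-memory list. It only shows the shape of what I
mean: one small index per (host, log type), with a query touching only the
partitions it needs and merging them when it spans several.

```python
import heapq
from collections import defaultdict


class PartitionedLogIndex:
    """Toy model: one small 'index' (a list of rows) per (host, log_type)."""

    def __init__(self):
        # Hypothetical layout: partition key -> rows stored in that partition
        self.partitions = defaultdict(list)

    def add(self, host, log_type, timestamp, message):
        row = (timestamp, host, log_type, message)
        self.partitions[(host, log_type)].append(row)

    def search(self, hosts=None, log_types=None):
        """Return rows newest first, reading only the matching partitions."""
        selected = [
            sorted(rows, reverse=True)  # each partition sorted on its own
            for (host, log_type), rows in self.partitions.items()
            if (hosts is None or host in hosts)
            and (log_types is None or log_type in log_types)
        ]
        # Merge the already sorted partitions instead of re-sorting all rows
        return list(heapq.merge(*selected, reverse=True))


index = PartitionedLogIndex()
index.add("hostA", "type_a", 3, "a3")
index.add("hostA", "type_b", 1, "b1")
index.add("hostB", "type_a", 2, "x2")

only_host_a = index.search(hosts={"hostA"})          # skips hostB's partition
both_hosts = index.search(hosts={"hostA", "hostB"})  # the "join" case
```

In this sketch the single-host query never looks at the other host's rows,
while the query across host A and B has to merge the partitions, which is
the part I am worried could be slow at scale.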

I hope my question is clear.

Regards,
Archit
