Hello,
I want to be able to grep custom strings in a lot of files stored in HDFS.
I have at least 500 GB-2 TB to grep, split across ~50-200 files.

What would be the fastest way to get:
- the matching lines
- the names of the files containing those lines

I tested with a MapReduce grep, but it's too slow for interactive use.

Do I need to index everything in Hive or Solr?
Would Spark be faster than MapReduce?
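To make it concrete, here is a minimal spark-shell sketch of what I have in mind (the path and pattern are just placeholders for my real data; `spark` is the session the shell provides):

    import org.apache.spark.sql.functions.input_file_name

    val pattern = "my custom string"                 // placeholder search string
    val lines = spark.read.textFile("hdfs:///data/*") // placeholder HDFS path
      .withColumn("file", input_file_name())          // tag each line with its source file
    val matches = lines.filter(lines("value").contains(pattern))

    matches.show(20, truncate = false)                        // matching lines + filename
    matches.select("file").distinct().show(truncate = false)  // just the filenames

Would something like this be noticeably faster than the MapReduce job for repeated interactive queries?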


Thanks
