Hello,
I want to be able to grep custom strings in a lot of files stored in HDFS.
I have at least 500 GB-2 TB to grep, split across ~50-200 files.

What would be the fastest way to get:
- the matching lines
- the names of the files containing those lines

I tested with a MapReduce grep, but it's too slow for interactive use.

Do I need to index everything in Hive or Solr?
Would Spark be faster than MapReduce?
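To make it concrete, here is a minimal spark-shell sketch of what I have in mind (the path and pattern are just placeholders for my real data; `spark` is the session the shell provides):

    import org.apache.spark.sql.functions.input_file_name

    val pattern = "my custom string"                 // placeholder search string
    val lines = spark.read.textFile("hdfs:///data/*") // placeholder HDFS path
      .withColumn("file", input_file_name())          // tag each line with its source file
    val matches = lines.filter(lines("value").contains(pattern))

    matches.show(20, truncate = false)                        // matching lines + filename
    matches.select("file").distinct().show(truncate = false)  // just the filenames

Would something like this be noticeably faster than the MapReduce job for repeated interactive queries?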


Thanks
