Hello, I want to be able to grep custom strings in a lot of files stored in HDFS. I have at least 500 GB-2 TB of data to grep, split across ~50-200 files.
What would be the best way to get the fastest results for:

- the matching lines
- the filenames containing those matched lines

I tested a MapReduce grep, but it's too slow for interactive use. Do I need to index everything in Hive or Solr? Would Spark be faster than MapReduce? Thanks
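For context, the kind of Spark job I have in mind looks roughly like this (a minimal Scala sketch; the HDFS glob, search string, and app name are placeholders, not my actual setup):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.input_file_name

object HdfsGrep {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("HdfsGrep").getOrCreate()
    import spark.implicits._

    val pattern = "ERROR 1234"          // placeholder search string
    val input   = "hdfs:///data/logs/*" // placeholder HDFS glob

    // Read every line and tag it with the file it came from
    val lines = spark.read.textFile(input)
      .withColumn("file", input_file_name())

    // Keep only the lines containing the pattern; cache since we use it twice
    val matches = lines.filter($"value".contains(pattern)).cache()

    // 1) the matching lines, with their source file
    matches.select("file", "value").show(50, truncate = false)

    // 2) the distinct filenames that contain at least one match
    matches.select("file").distinct().collect().foreach(println)

    spark.stop()
  }
}
```

Is this the right direction, or is a full-text index unavoidable at this scale?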
