Apache Drill can read from Amazon S3 buckets. While the performance on S3 might not be as good as on HDFS/MapR-FS, I would expect it to work.
Have you tried using Drill here and hit any issues?

On Thu, Jul 21, 2016 at 1:59 PM, Suma Cherukuri <[email protected]> wrote:

> Hi,
>
> Good Afternoon!
>
> I work as an engineer at Symantec. My team works on Multi-tenant Event
> Processing System. Just a high level background, our customers write data
> to Kafka brokers through agents like Logstash and we process the events and
> save the log data in Elasticsearch and S3.
>
> Use Case: We have a use case wherein we write batches of events to S3
> when file size limitation of 1MB (specific to our case) or a certain time
> threshold is reached. We are planning on merging the number of files
> specific to a folder into one single file based on either time limit such
> as every 24 hrs.
>
> We were considering various options available today and would like to
> know if Apache Drill can be used to serve the purpose.
>
> Looking forward to hearing from you.
>
> Thank you
> Suma Cherukuri
