Re: Performance for hive external to hbase with serval terabyte or more data

2016-05-12 Thread Yi Jiang
Hi Jorn, thank you for replying. We are currently exporting data from HBase to Hive, as I mentioned in my previous message. I work at a big company. I personally like Tez, but it is not even on our roadmap. Thank you. On May 12, 2016, at 1:52 AM, Jörn Franke

Re: Performance for hive external to hbase with serval terabyte or more data

2016-05-11 Thread Jörn Franke
Why don't you export the data from HBase to Hive, e.g. in ORC format? You should not use MapReduce with Hive, but Tez. Also use a recent Hive version (at least 1.2). You can then run your queries there. For large log file processing in real time, one alternative, depending on your needs, could be Solr on

Re: Performance for hive external to hbase with serval terabyte or more data

2016-05-11 Thread Sathi Chowdhury
Hi Yang, did you consider the bulk loading option? http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/ This may be the way to go. Thanks, Sathi. On May 11, 2016, at 6:07 PM, Yi Jiang wrote: Hi guys, recently we have been debating

Performance for hive external to hbase with serval terabyte or more data

2016-05-11 Thread Yi Jiang
Hi guys, recently we have been debating whether to use HBase as the destination for our data pipeline job. Basically, we want to save our logs into HBase, and our pipeline can generate 2-4 terabytes of data every day, but our IT department thinks it is not a good idea to scan HBase, as it will cause the