Hi ShaoFeng,
Very good questions; please see my comments starting with [Gang]:
1) How to bridge the real-time cube with a cube built from Hive? You know,
in Kylin the source type is marked at the table level, which means a table
is either a Hive table, a JDBC table, or a streaming table. To implement …
You are welcome, ShaoFeng! Storage and query engine are inseparable and
should be designed together to fully exploit each other's abilities. And I'm
very excited about the upcoming columnar storage and query engine!
--
Regards!
Aron Tao
ShaoFeng Shi wrote on Fri, Oct 26, 2018 at 10:28 PM:
Exactly; thank you, JiaTao, for the comments!
JiaTao Tao wrote on Thu, Oct 25, 2018 at 6:12 PM:
As far as I'm concerned, using Parquet as Kylin's storage format is pretty
appropriate. From the aspect of integrating Spark, Spark has made a lot of
optimizations for Parquet; e.g., we can enjoy Spark's vectorized reading and
lazy dictionary decoding, etc.
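To make the lazy-dictionary-decoding point concrete, here is a minimal pure-Python sketch (not Spark or Kylin code; the function and data are illustrative). The idea is that a dictionary-encoded column stores small integer codes plus a value dictionary, so a filter can be evaluated once per distinct dictionary entry and only the surviving rows are decoded to real values:

```python
# Sketch of lazy dictionary decoding (illustrative, not Spark internals).
# A dictionary-encoded column = integer codes + a dictionary of values.

def lazy_filter(codes, dictionary, predicate):
    """Evaluate `predicate` per distinct value, then scan the codes."""
    # Decide once per dictionary entry: |dict| checks instead of |rows|.
    keep = [predicate(v) for v in dictionary]
    # Decode only the rows whose code survived the filter.
    return [dictionary[c] for c in codes if keep[c]]

# A column of 10 rows over a 3-entry dictionary.
dictionary = ["beijing", "shanghai", "shenzhen"]
codes = [0, 1, 1, 2, 0, 0, 2, 1, 0, 2]

# Filter: cities starting with "sh"; only 3 predicate calls are made.
result = lazy_filter(codes, dictionary, lambda city: city.startswith("sh"))
```

With a low-cardinality column, the predicate runs a handful of times instead of once per row, and non-matching rows are never decoded at all.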
And here are my thoughts about integrating Spark a…
Hi guys,
I uploaded the initial design document to JIRA, please feel free to comment:
https://issues.apache.org/jira/browse/KYLIN-3621
ShaoFeng Shi wrote on Fri, Oct 12, 2018 at 9:44 AM:
JIRA and sub-tasks are created for this. Welcome to comment there:
https://issues.apache.org/jira/browse/KYLIN-3621
ShaoFeng Shi wrote on Mon, Oct 8, 2018 at 2:45 PM:
I agree; the new storage should be Hadoop/HDFS compliant, and it also needs
to be cloud-storage (like S3, blob storage) friendly, as more and more users
are running big data analytics in the cloud.
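One reason an HDFS-compatible format helps here: pointing builds and queries at object storage can then be largely a configuration matter through Hadoop's s3a connector. A hedged sketch of such a configuration (the property names are standard Hadoop s3a settings; the bucket name is a placeholder, and credentials are deliberately omitted):

```xml
<!-- core-site.xml sketch: route Hadoop-compatible paths to S3 via s3a.
     "my-kylin-bucket" is a placeholder. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>s3a://my-kylin-bucket</value>
  </property>
  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
</configuration>
```

The caveat raised later in this thread still applies: file layouts tuned for HDFS (many small files, rename-based commits) may behave poorly on object stores, so the layout itself also needs to be cloud-aware.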
Luke Han wrote on Sun, Oct 7, 2018 at 7:44 PM:
From: Luke Han
Sent: Oct 7, 2018 19:44
To: dev
Subject: Re: [DISCUSS] Columnar storage engine for Apache Kylin
It makes sense to bring a better storage option for Kylin.
The option should be open, and people could have different ways to create an
adaptor for the underlying storage.
Considering that the huge adoptions of Kylin today all run on Hadoop/HDFS, I
prefer Parquet or ORC or other HDFS-compatible options.
Love this discussion. I'd like to highlight 3 major roles HBase is playing
currently, so we don't miss any of them when looking for a replacement.
1) Storage: a high-speed big data storage
2) Cache: a distributed storage cache layer (was BlockCache)
3) MPP: a distributed computation framework (was Coprocessor)
Hi Billy,
Yes, the cloud storage should be considered. The traditional file layouts
on HDFS may not work well on cloud storage. Kylin needs to allow extension
here. I will add this to the requirement.
Billy Liu wrote on Sat, Sep 29, 2018 at 3:22 PM:
Hi Shaofeng,
I'd like to add one more characteristic: cloud-native storage support.
Quite a few users are using S3 on AWS, or Azure Data Lake Storage on
Azure. If the new storage engine could be more cloud friendly, more users
could get benefits from it.
With warm regards,
Billy Liu
ShaoFeng Shi wrote on Sep 2…, 2018:
Hi Gang, very good questions; that's why we need to raise such a discussion
publicly. Please check my comments below, starting with [shaofengshi]. Feel
free to comment.
1. Is it possible to locate a cuboid quickly in a Parquet file? How to save
cuboid metadata info in Parquet's FileMetaData, jus…
Hi Yanghong,
Thanks for your question. I think it is not required that other engines
know how to read Kylin's storage, but it is nice to have if possible. We
can extend the file format if Parquet or ORC couldn't match Kylin's
requirements, but it's not necessary to re-invent a new format.
Zhong, Yang…
I like Parquet; it is a very efficient format and is supported by various
projects, but there are some questions if we use Parquet as the cube storage
format:
1. Is it possible to locate a cuboid quickly in a Parquet file? How to save
cuboid metadata info in Parquet's FileMetaData, just in the m…
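One way the cuboid-location question could be answered is to avoid file-level metadata entirely: identify each cuboid by a bitmask over the cube's dimensions and use that id as a partition key, so locating a cuboid becomes a path lookup rather than a metadata scan. A minimal sketch under that assumption (this is a hypothetical layout for illustration, not the design in KYLIN-3621; `DIMENSIONS` and `cuboid_path` are made-up names):

```python
# Sketch: cuboid id as a dimension bitmask, mapped to a partition-style
# path (hypothetical layout, one directory per cuboid).

DIMENSIONS = ["year", "city", "category"]  # example cube dimensions

def cuboid_id(dims):
    """Bitmask with bit i set when DIMENSIONS[i] is present in `dims`."""
    return sum(1 << i for i, d in enumerate(DIMENSIONS) if d in dims)

def cuboid_path(dims, base="/kylin/cube_v1"):
    """Partition-style path; a query locates its cuboid by path alone."""
    return f"{base}/cuboid={cuboid_id(dims)}/part-0000.parquet"

# The (year, city) cuboid sets bits 0 and 1, giving id 3.
path = cuboid_path(["year", "city"])
```

With such a layout, row-group and column statistics inside each Parquet file are only needed for pruning within one cuboid, not for finding it.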
I have one question about the characteristics of Kylin's columnar storage
files: whether it should be a standard or common one. Since the data stored
in the storage engine is Kylin-specific, is it necessary for other engines to
know how to build data into and how to read data from the storage engine?
Hi Kylin developers,
HBase has been Kylin's storage engine since the first day; Kylin on HBase
has been verified as a success, supporting low-latency & high-concurrency
queries at a very large data scale. Thanks to HBase, most Kylin users get,
on average, less than 1-second query response.