Hi ShaoFeng,
Very good questions; please see my comments starting with [Gang]:
1) How to bridge the real-time cube with a cube built from Hive? You know,
in Kylin the source type is marked at the table level, which means a table
is either a Hive table, a JDBC table, or a streaming table. To implement …
You are welcome, ShaoFeng! Storage and query engine are inseparable and
should be designed together to fully exploit each other's abilities. And I'm
very excited about the upcoming columnar storage and query engine!
--
Regards!
Aron Tao
ShaoFeng Shi wrote on Fri, Oct 26, 2018 at 10:28 PM:
Exactly; thank you, JiaTao, for the comments!
JiaTao Tao wrote on Thu, Oct 25, 2018 at 6:12 PM:
As far as I'm concerned, using Parquet as Kylin's storage format is pretty
appropriate. From the aspect of integrating Spark, Spark has made a lot of
optimizations for Parquet; e.g., we can enjoy Spark's vectorized reading and
lazy dictionary decoding, etc.
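To make the lazy-dictionary-decoding point concrete, here is a minimal pure-Python sketch (not Spark or Kylin code; the function and data are illustrative). The idea is that a dictionary-encoded column stores small integer codes plus a value dictionary, so a filter can be evaluated once per distinct dictionary entry and only the surviving rows are decoded to real values:

```python
# Sketch of lazy dictionary decoding (illustrative, not Spark internals).
# A dictionary-encoded column = integer codes + a dictionary of values.

def lazy_filter(codes, dictionary, predicate):
    """Evaluate `predicate` per distinct value, then scan the codes."""
    # Decide once per dictionary entry: |dict| checks instead of |rows|.
    keep = [predicate(v) for v in dictionary]
    # Decode only the rows whose code survived the filter.
    return [dictionary[c] for c in codes if keep[c]]

# A column of 10 rows over a 3-entry dictionary.
dictionary = ["beijing", "shanghai", "shenzhen"]
codes = [0, 1, 1, 2, 0, 0, 2, 1, 0, 2]

# Filter: cities starting with "sh"; only 3 predicate calls are made.
result = lazy_filter(codes, dictionary, lambda city: city.startswith("sh"))
```

With a low-cardinality column, the predicate runs a handful of times instead of once per row, and non-matching rows are never decoded at all.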
And here are my thoughts about integrating Spark a…
Hi guys,
I uploaded the initial design document to JIRA, please feel free to comment:
https://issues.apache.org/jira/browse/KYLIN-3621
ShaoFeng Shi wrote on Fri, Oct 12, 2018 at 9:44 AM:
JIRA and sub-tasks are created for this. Welcome to comment there:
https://issues.apache.org/jira/browse/KYLIN-3621
ShaoFeng Shi wrote on Mon, Oct 8, 2018 at 2:45 PM:
I agree; the new storage should be Hadoop/HDFS compliant, and it also needs
to be cloud-storage (like S3, blob storage) friendly, as more and more users
are running big data analytics in the cloud.
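One reason an HDFS-compatible format helps here: pointing builds and queries at object storage can then be largely a configuration matter through Hadoop's s3a connector. A hedged sketch of such a configuration (the property names are standard Hadoop s3a settings; the bucket name is a placeholder, and credentials are deliberately omitted):

```xml
<!-- core-site.xml sketch: route Hadoop-compatible paths to S3 via s3a.
     "my-kylin-bucket" is a placeholder. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>s3a://my-kylin-bucket</value>
  </property>
  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
</configuration>
```

The caveat raised later in this thread still applies: file layouts tuned for HDFS (many small files, rename-based commits) may behave poorly on object stores, so the layout itself also needs to be cloud-aware.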
Luke Han wrote on Sun, Oct 7, 2018 at 7:44 PM:
From: Luke Han
Sent: Oct 7, 2018 19:44
To: dev
Subject: Re: [DISCUSS] Columnar storage engine for Apache Kylin
It makes sense to bring a better storage option for Kylin.
The option should be open, and people could have different ways to create an
adaptor for the underlying storage.
Considering that the huge adoptions of Kylin today all run on Hadoop/HDFS, I
prefer Parquet or ORC or other HDFS-compatible options.
Love this discussion. I'd like to highlight 3 major roles HBase is playing
currently, so we don't miss any of them when looking for a replacement.
1) Storage: a high-speed big data storage
2) Cache: a distributed storage cache layer (was BlockCache)
3) MPP: a distributed computation framework (was Coprocessor)
Hi Billy,
Yes, the cloud storage should be considered. The traditional file layouts
on HDFS may not work well on cloud storage. Kylin needs to allow extension
here. I will add this to the requirement.
Billy Liu wrote on Sat, Sep 29, 2018 at 3:22 PM:
Hi Shaofeng,
I'd like to add one more characteristic: cloud-native storage support.
Quite a few users are using S3 on AWS, or Azure Data Lake Storage on
Azure. If the new storage engine could be more cloud friendly, more users
could get benefits from it.
With warm regards,
Billy Liu
ShaoFeng Shi wrote on Sep 2…, 2018:
Hi Gang, very good questions; that's why we need to raise such a discussion
publicly. Please check my comments below, starting with [shaofengshi]. Feel
free to comment.
1. Is it possible to locate a cuboid quickly in a Parquet file? How to save
cuboid metadata info in Parquet's FileMetaData, jus…
Hi Yanghong,
Thanks for your question. I think it is not required that other engines
know how to read Kylin's storage, but it is nice to have if possible. We
can extend the file format if Parquet or ORC couldn't match Kylin's
requirements, but it's not necessary to re-invent a new format.
Zhong, Yang…
I like Parquet; it is a very efficient format and is supported by various
projects, but there are some questions if we use Parquet as the cube storage
format:
1. Is it possible to locate a cuboid quickly in a Parquet file? How to save
cuboid metadata info in Parquet's FileMetaData, just in the m…
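One way the cuboid-location question could be answered is to avoid file-level metadata entirely: identify each cuboid by a bitmask over the cube's dimensions and use that id as a partition key, so locating a cuboid becomes a path lookup rather than a metadata scan. A minimal sketch under that assumption (this is a hypothetical layout for illustration, not the design in KYLIN-3621; `DIMENSIONS` and `cuboid_path` are made-up names):

```python
# Sketch: cuboid id as a dimension bitmask, mapped to a partition-style
# path (hypothetical layout, one directory per cuboid).

DIMENSIONS = ["year", "city", "category"]  # example cube dimensions

def cuboid_id(dims):
    """Bitmask with bit i set when DIMENSIONS[i] is present in `dims`."""
    return sum(1 << i for i, d in enumerate(DIMENSIONS) if d in dims)

def cuboid_path(dims, base="/kylin/cube_v1"):
    """Partition-style path; a query locates its cuboid by path alone."""
    return f"{base}/cuboid={cuboid_id(dims)}/part-0000.parquet"

# The (year, city) cuboid sets bits 0 and 1, giving id 3.
path = cuboid_path(["year", "city"])
```

With such a layout, row-group and column statistics inside each Parquet file are only needed for pruning within one cuboid, not for finding it.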
I have one question about the characteristics of Kylin's columnar storage
files: whether it should be a standard or common one. Since the data stored
in the storage engine is Kylin-specific, is it necessary for other engines to
know how to build data into and how to read data from the storage engine?
Hi Kylin developers,
HBase has been Kylin's storage engine since the first day; Kylin on HBase
has been verified as a success, supporting low-latency & high-concurrency
queries at a very large data scale. Thanks to HBase, most Kylin users get,
on average, less than 1-second query response.