Re: [jira] [Created] (KYLIN-2564) Got "UsernameNotFoundException: User XXX does not exist" in new Kylin instance
The problem is the "UsernameNotFoundException: User XXX does not exist" error.

On 5 Sep 2017, 14:34 +0800, Copperfield, wrote:
> I also hit this problem when I use the ADMIN user to grant query privilege to
> another user. I use Kylin 2.0.0, but the other user can log in successfully.
> Does anyone have an idea?
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
Error when starting Kylin
Hi all,

I hit an error when I start Kylin:

java.net.SocketTimeoutException: callTimeout=6, callDuration=115048: row 'kylin_metadata_acl,,' on table 'hbase:meta'

I have no idea why it happens.
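A timeout like this usually means the HBase client gave up while scanning hbase:meta, so the first things to check are HBase's own health (is the meta-hosting region server up?) and the client timeouts Kylin uses. As a hedged sketch only — these are standard HBase client settings, but the values are illustrative, not a verified fix for this report — one could raise the timeouts in the hbase-site.xml on Kylin's classpath:

```xml
<!-- Illustrative values only; tune to your cluster. -->
<property>
  <name>hbase.rpc.timeout</name>
  <value>120000</value> <!-- per-RPC timeout, in ms -->
</property>
<property>
  <name>hbase.client.operation.timeout</name>
  <value>300000</value> <!-- top-level client operation timeout, in ms -->
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>120000</value> <!-- scanner lease period, in ms -->
</property>
```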
Re: about ConnectException when Extract Fact Table Distinct Columns
It's a conflict between the Hadoop configurations of the cluster and HBase.

On 12 Feb 2017, 09:21 +0800, ShaoFeng Shi, wrote:
> Did you try to add core-site.xml and yarn-site.xml to $KYLIN_HOME/conf?
> That is a workaround: putting the missing configuration files on Kylin's
> classpath.
>
> 2017-02-08 20:41 GMT+08:00 Copperfield :
>
> > Do you know why it happened?
> > And did you resolve it?
> >
> > --
> > View this message in context: http://apache-kylin.74782.x6.nabble.com/about-ConnectException-when-Extract-Fact-Table-Distinct-Columns-tp5253p7143.html
> > Sent from the Apache Kylin mailing list archive at Nabble.com.
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
Re: Streaming Build Timestamp parse error
I wrote a demo that uses 1484236798000 as a String, following the same process as the Kylin code (see the picture below); it always throws this exception. I think the value has to be converted to the java.sql.Timestamp class.

On 20 Jan 2017, 16:17 +0800, Mario Copperfield <xwhfcen...@gmail.com>, wrote:
> Do you mean that I should use 1484236798 rather than 1484236798000?
>
> [earlier quoted text trimmed]
Re: Streaming Build Timestamp parse error
Do you mean that I should use 1484236798 rather than 1484236798000?

On 20 Jan 2017, 16:03 +0800, ShaoFeng Shi <shaofeng...@apache.org>, wrote:
> I think you would prefer to write SQL like "where partitionCol >
> '2017-01-12 04:00:00'", instead of "where partitionCol > 1484236798000",
> right?
>
> 2017-01-20 16:00 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
>
> > Hi Wu xin,
> >
> > It is okay to use the Unix time as the "tsColumn" in a streaming cube;
> > Kylin will then parse the value to get the date/time in yyyy-MM-dd or
> > yyyy-MM-dd HH:mm:ss format. The cube's "partition column", however,
> > always needs to be a valid Date or Timestamp value, not an epoch time in
> > milliseconds.
> >
> > [earlier quoted text trimmed]
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
Re: Streaming Build Timestamp parse error
I think it might cause confusion, because we usually define a timestamp as a Unix timestamp such as "1484236798000".

On 20 Jan 2017, 13:25 +0800, ShaoFeng Shi <shaofeng...@apache.org>, wrote:
> "When i use timestamp as partition col, it occurs exception as follow:";
>
> Please be aware that "1484236798000" isn't a timestamp value. You should use
> "MINUTE_START" or "HOUR_START" as the cube partition column, which are real
> timestamps.
>
> 2017-01-20 10:38 GMT+08:00 Mario Copperfield <xwhfcen...@gmail.com>:
>
> > Additionally, the stack log is in the following picture.
> >
> > On 20 Jan 2017, 10:27 +0800, Mario Copperfield <xwhfcen...@gmail.com>, wrote:
> > > I'm sorry, I didn't find the property "tsParser"; did you mean
> > > "Parser Name"?
> > >
> > > The following picture is my configuration.
> > >
> > > I changed a little code so that it works, but I found that the master
> > > branch has changed the class "UpdateCubeInfoAfterBuildStep" so much
> > > that I did not open a pull request.
> > >
> > > This is my change:
> > > https://github.com/xwhfcenter/kylin/commit/3e13bf244b82387107b5b445ae9946daf919cf54
> > >
> > > On 19 Jan 2017, 22:31 +0800, ShaoFeng Shi <shaofeng...@apache.org>, wrote:
> > > > Kylin's DefaultTimeParser will parse a long-typed timestamp value
> > > > like "1484236798000", so in this case you don't need to specify
> > > > "tsParser".
> > > >
> > > > 2017-01-19 22:18 GMT+08:00 <xwhfcen...@gmail.com>:
> > > >
> > > > > I'll check tomorrow
> > > > >
> > > > > Sent from Alto
> > > > > On Thursday, January 19, 2017 at 22:10 ShaoFeng Shi <shaofeng...@apache.org> wrote:
> > > > > Hi,
> > > > >
> > > > > Could you provide the stack trace? Besides, did you specify the
> > > > > "tsParser" property when defining the table? If yes, what's the value?
> > > > >
> > > > > 2017-01-19 21:45 GMT+08:00 Li Yang :
> > > > >
> > > > > > Deserves a JIRA, I think.
> > > > > >
> > > > > > Cheers
> > > > > > Yang
> > > > > >
> > > > > > On Mon, Jan 16, 2017 at 2:34 PM, Copperfield wrote:
> > > > > >
> > > > > > > When I use a timestamp as the partition column, it throws the
> > > > > > > following exception:
> > > > > > > java.text.ParseException: Unparseable date: "1484236798000" does not match
> > > > > > > (\p{Nd}++)\Q-\E(\p{Nd}++)\Q-\E(\p{Nd}++)\Q \E(\p{Nd}++)\Q:\E(\p{Nd}++)\Q:\E(\p{Nd}++)
> > > > > > > My Kylin version is 1.6.0.
> > > > > > >
> > > > > > > --
> > > > > > > View this message in context: http://apache-kylin.74782.x6.nabble.com/Streaming-Build-Timestamp-parse-error-tp6952.html
> > > > > > > Sent from the Apache Kylin mailing list archive at Nabble.com.
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
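The distinction in this thread can be made concrete with a small stand-alone sketch. This is not Kylin's actual parser — the class and method names below are invented for illustration — but it shows why a yyyy-MM-dd HH:mm:ss pattern rejects the raw epoch string "1484236798000" with exactly the "Unparseable date" error above, while the same instant, once formatted, is accepted:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not Kylin code: contrasts an epoch-millis string
// with a real timestamp literal under a date-pattern parser.
public class EpochVsTimestamp {
    static final SimpleDateFormat FMT = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    static {
        FMT.setTimeZone(TimeZone.getTimeZone("UTC"));
    }

    // Convert a millisecond epoch value into the literal form a
    // Date/Timestamp partition column expects.
    static String toTimestampLiteral(long epochMillis) {
        return FMT.format(new Date(epochMillis));
    }

    // True when the string parses under the yyyy-MM-dd HH:mm:ss pattern.
    static boolean parses(String s) {
        try {
            FMT.parse(s);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Formatting the epoch value yields a string the pattern accepts.
        System.out.println(toTimestampLiteral(1484236798000L)); // 2017-01-12 15:59:58 (UTC)
        // Feeding the raw epoch string to the date pattern fails,
        // reproducing the ParseException reported in this thread.
        System.out.println(parses("1484236798000"));       // false
        System.out.println(parses("2017-01-12 15:59:58")); // true
    }
}
```

In other words, the epoch value is fine as the streaming "tsColumn", but the partition column needs the formatted literal.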
Streaming Cube Bulkload
At the "Load HFile to HBase Table" step, the following exception sometimes occurs:

java.io.IOException: BulkLoad encountered an unrecoverable problem
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.bulkLoadPhase(LoadIncrementalHFiles.java:387)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:319)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:919)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.storage.hbase.steps.BulkLoadJob.run(BulkLoadJob.java:70)
org.apache.hadoop.hbase.client.RpcRetryingCaller@39f0ffa8, org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 6 ms.
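A RegionTooBusyException during bulk load generally means the region's lock could not be taken before the busy-wait window expired, typically because the region was under compaction or heavy write pressure at that moment. As a hedged sketch only — these are standard HBase settings, but the values are illustrative and this is not a verified fix for this report — one could give the loader more patience in hbase-site.xml:

```xml
<!-- Illustrative values only. -->
<property>
  <name>hbase.bulkload.retries.number</name>
  <value>20</value> <!-- client retries for the bulk-load phase; 0 means retry forever -->
</property>
<property>
  <name>hbase.busy.wait.duration</name>
  <value>120000</value> <!-- server-side ms to wait on a busy region's lock -->
</property>
```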
Re: Question: How to find commit for 1.6.0 release
You can use the tag kylin-1.6.0.

On 12 Jan 2017, 10:29 +0800, Chuanlei Ni, wrote:
> Hi, Kylin devs,
>
> I notice that the commit id for our 1.6.0 release is
> "ed6982c8b3baaad08b7e4956001339979724d9a7",
> but I cannot find this commit in master. What shall I do to find this
> commit?
>
> thx
Re: Consulting "EXTENDED_COLUMN"
Thanks, and now I understand!

On 11 Jan 2017, 20:27 +0800, ShaoFeng Shi <shaofeng...@apache.org>, wrote:
> It is similar but differs: 1) a "DERIVED" column must be a column on a lookup
> table, while "EXTENDED" does not need this and can be a column on the fact
> table; 2) a "DERIVED" value comes from the lookup table's snapshot, while an
> "EXTENDED" value comes from this measure.
>
> 2017-01-11 18:24 GMT+08:00 Mario Copperfield <xwhfcen...@gmail.com>:
>
> > I have a question: is EXTENDED_COLUMN the same as DERIVED_COLUMN?
> >
> > [earlier quoted text trimmed]
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
Re: Consulting "EXTENDED_COLUMN"
I have a question: is EXTENDED_COLUMN the same as DERIVED_COLUMN?

On 1 Dec 2016, 07:35 +0800, Billy(Yiming) Liu, wrote:
> Thanks, Alberto. The explanation is accurate. EXTENDED_COLUMN is only used
> for representation, not for filtering or grouping, which is done by the
> HOST_COLUMN. So EXTENDED_COLUMN is not a dimension; it works like a
> key/value map against the HOST_COLUMN.
>
> If the value in EXTENDED_COLUMN is not long, you could just define two
> dimensions with a joint dimension setting; it has almost the same performance
> impact as EXTENDED_COLUMN (which saves one dimension), but is easier to
> understand.
>
> 2016-11-30 19:00 GMT+08:00 Alberto Ramón :
>
> > This will help you:
> > http://kylin.apache.org/docs/howto/howto_optimize_cubes.html
> >
> > The idea is always: how can I reduce the number of dimensions?
> > If you reduce dimensions, the time and resources to build the cube, and its
> > final size, decrease --> that's good.
> >
> > An example can be DIM_Persons: Id_Person, Name, Surname, Address, ...
> > Id_Person can be the host column,
> > and the other columns can be looked up from the ID --> they are extended columns.
> >
> > 2016-11-30 11:35 GMT+01:00 仇同心 :
> >
> > > Hi all,
> > > I don't understand the usage scenarios of EXTENDED_COLUMN, although I saw
> > > this article: https://issues.apache.org/jira/browse/KYLIN-1313.
> > > What is the meaning of the "Host Column" and "Extended Column" parameters?
> > > Why use this expression, and what aspects of optimization does it solve?
> > > Could you explain it together with a SQL statement?
> > >
> > > Thanks~
>
> --
> With Warm regards
>
> Yiming Liu (刘一鸣)
Re: job build
OK, thanks, I got it!

On 6 Dec 2016, 11:37 +0800, ShaoFeng Shi <shaofeng...@apache.org>, wrote:
> Hi Copperfield,
>
> The "auto merge" is a nice-to-have feature for a partitioned cube; for a
> partitioned cube, the build is always incremental.
>
> 2016-12-06 11:08 GMT+08:00 Mario Copperfield <xwhfcen...@gmail.com>:
>
> > Hi, all
> > I have a question about job builds. Does Kylin always use an incremental
> > build strategy even when we don't set auto merge?
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
Re: Re: Can Kylin load data from Hbase?
I don't think Kylin can transfer data from HBase to Hive.

On Mon, Nov 7, 2016 at 4:07 PM, 446463...@qq.com <446463...@qq.com> wrote:
> Thanks!
> Sorry, my English is poor.
> Second question:
> I have a table in HBase whose columns are not ordered. I want to
> smooth it and export it to Hive through Kylin; can Kylin do this?
>
> From: Alberto Ramón
> Date: 2016-11-07 15:57
> To: dev
> Subject: Re: Can Kylin load data from Hbase?
> Hello,
>
> See this: http://kylin.apache.org/development/plugin_arch.html
> As a data source, there are only two types: Hive and Kafka (in progress).
>
> (I don't understand your second question.)
>
> 2016-11-07 7:40 GMT+01:00 446463...@qq.com <446463...@qq.com>:
>
> > Hi:
> > Can Kylin load data from HBase?
> > And a floating table which is an anomaly in HBase?

-- 
Best regards,
Amuro Copperfield
Re: java.lang.OutOfMemoryError: Java heap space
You can try setting KYLIN_JVM_SETTINGS in /bin/setenv.sh.

On Mon, Oct 10, 2016 at 4:55 PM, 沙漠火狐 <278211...@qq.com> wrote:
> hi
> I have a big lookup table, about 1.3 GB. When I build a cube on it,
> /logs/kylin.out says:
>
> Oct 10, 2016 4:20:22 PM org.apache.catalina.startup.Catalina start
> INFO: Server startup in 24004 ms
> Found segment ordrpath_goodsid[2016100800_2016100900]
> #
> # java.lang.OutOfMemoryError: Java heap space
> # -XX:OnOutOfMemoryError="kill -9 %p"
> # Executing /bin/sh -c "kill -9 23272"...
>
> and then Kylin stops.
>
> So I added JAVA_OPTS="-Xms256m -Xmx10240m -XX:PermSize=128m
> -XX:MaxPermSize=4096m" to catalina.sh,
>
> and changed the config in kylin_job_conf_inmem.xml to:
>
> <property>
>   <name>mapreduce.map.memory.mb</name>
>   <value>4096</value>
> </property>
> <property>
>   <name>mapreduce.map.java.opts</name>
>   <value>-Xmx10240m</value>
> </property>
>
> I also changed these properties in kylin.properties:
> kylin.dictionary.max.cardinality=5000
> kylin.table.snapshot.max_mb=1500
>
> but the problem is still there. Is there any config I should add to solve
> this problem? Thanks!

-- 
Best regards,
Amuro Copperfield
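For reference, a hedged sketch of what such a setting might look like in setenv.sh. The flag values here are purely illustrative, not a recommendation (and -XX:MaxPermSize only applies to JDK 7 and earlier):

```shell
# Illustrative only: give the Kylin server JVM a larger heap so big
# lookup-table snapshots fit in memory; tune -Xmx to what the machine
# can actually spare.
export KYLIN_JVM_SETTINGS="-Xms1g -Xmx8g -XX:MaxPermSize=512m -verbose:gc -XX:+PrintGCDetails"
```

Note this only sizes the Kylin server process; the in-mem cubing mappers are sized separately via mapreduce.map.memory.mb and mapreduce.map.java.opts in kylin_job_conf_inmem.xml, as the quoted message already does.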
Re: [Award] Kylin won InfoWorld Bossie Awards 2016, again!
Congratulations!

On Fri, Sep 23, 2016 at 12:35 AM, Luke Han wrote:
> Hi community,
> You may already know: in the latest news from InfoWorld,
> Apache Kylin has been selected for this year's Bossie Awards again:
> *The Best Open Source Big Data Tools*. This is the second time we have won.
> There are 12 projects in this year's list: TensorFlow, Beam, Spark, Kylin,
> Kafka, Impala, Elasticsearch, SlamData, Zeppelin, Solr, StreamSets, and Titan.
> Most of them are Apache projects (including incubating). Congrats to the ASF!
>
> I would like to take this moment to say thanks to everybody; we could never
> have won such industry recognition without everyone's contribution. Thanks :-)
>
> Here's the news link, enjoy it:
>
> http://www.infoworld.com/article/3120856/open-source-tools/bossie-awards-2016-the-best-open-source-big-data-tools.html#slide9
>
> Thanks.
> Luke Han

-- 
Best regards,
Amuro Copperfield
Re: Fwd: About Kafka in Kylin
OK, I'll try.

On 9/13/16, ShaoFeng Shi <shaofeng...@apache.org> wrote:
> Sorry, one correction:
>
> "if some messages arrive later than the margin, they will not be lost"
> should be:
> "if some messages arrive later than the margin, they will be lost"
>
> Mario, you can try to set a bigger margin value to reduce the possibility.
>
> 2016-09-13 22:05 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
>
>> In 1.5.x streaming OLAP, Kylin uses a timestamp range to seek the
>> start/end offsets in Kafka, which is a binary search. It allows a margin
>> window, but if some messages arrive later than the margin, they will not
>> be lost;
>>
>> Now we're working on a new implementation, which will strictly use offsets
>> to fetch the new messages each time, so no messages will be lost.
>>
>> 2016-09-13 15:53 GMT+08:00 Billy(Yiming) Liu <liuyiming@gmail.com>:
>>
>>> The current design is still an experimental approach. Kafka cannot
>>> guarantee a global order, so we had to find another solution. The new
>>> streaming OLAP design will rely on the Kafka partition order,
>>> instead of the app timestamp. The code is still under the KYLIN-1726 branch.
>>>
>>> 2016-09-13 15:46 GMT+08:00 Mario Copperfield <xwhfcen...@gmail.com>:
>>>
>>> > OK, thank you.
>>> >
>>> > On Tue, Sep 13, 2016 at 3:27 PM, Sarnath K <stell...@gmail.com> wrote:
>>> >
>>> > > Yes, that's true. If you are looking at an app timestamp (event origin
>>> > > time), then we can't binary-search on it, though binary search may be a
>>> > > good approximation for the common case.
>>> > > Not sure what Kylin is designed for. Let's wait to hear from the experts!
>>> > >
>>> > > On Sep 13, 2016 12:49, "Mario Copperfield" <xwhfcen...@gmail.com> wrote:
>>> > >
>>> > > > It's true that data appears in order in Kafka, but we can't assert
>>> > > > that the timestamps of the data are ordered; in fact, in real-time
>>> > > > feeds they often arrive out of order.
>>> > > >
>>> > > > On Tue, Sep 13, 2016 at 3:14 PM, Sarnath K <stell...@gmail.com> wrote:
>>> > > >
>>> > > > > I am not sure what Kylin does. But I know that data appears in
>>> > > > > order in the Kafka broker. The consumer, however, can consume in
>>> > > > > any order it likes, so offsets are driven by consumers and Kafka
>>> > > > > does not have a say in it.
>>> > > > > Sharing this based on my preliminary understanding of how Kafka works.
>>> > > > > Best,
>>> > > > > Sarnath
>>> > > > >
>>> > > > > On Sep 13, 2016 12:41, "Mario Copperfield" <xwhfcen...@gmail.com> wrote:
>>> > > > >
>>> > > > > > Dear all,
>>> > > > > > I am using Kylin's streaming build, and when I read the code for
>>> > > > > > this module, I found that Kylin uses binary search to find the
>>> > > > > > offset closest to the start timestamp. I doubt whether that works
>>> > > > > > if the data in Kafka is not ordered.
>>> > > > > > Thanks.
>>> > > > > >
>>> > > > > > --
>>> > > > > > Best regards,
>>> > > > > > Amuro Copperfield
>>>
>>> --
>>> With Warm regards
>>>
>>> Yiming Liu (刘一鸣)
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi

-- 
Best regards,
Amuro Copperfield
Fwd: About Kafka in Kylin
Dear all,
I am using Kylin's streaming build, and when I read the code for this module, I found that Kylin uses binary search to find the offset closest to the start timestamp. I doubt whether that works if the data in Kafka is not ordered.
Thanks.

-- 
Best regards,
Amuro Copperfield
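The concern can be made concrete with a small stand-alone sketch. This is illustrative only, not Kylin's actual seek code (the class and method names are invented): binary search over message timestamps is only sound when timestamps are monotonic within a partition, which is exactly why 1.5.x needed a margin window and why the KYLIN-1726 redesign moved to plain partition offsets.

```java
import java.util.Arrays;

// Hypothetical illustration: seeking a Kafka-like offset by timestamp.
public class OffsetSeek {
    // Returns the first offset whose timestamp is >= target,
    // ASSUMING ts[] is sorted — the assumption questioned in this thread.
    static int seek(long[] ts, long target) {
        int i = Arrays.binarySearch(ts, target);
        return i >= 0 ? i : -i - 1; // insertion point when not found exactly
    }

    public static void main(String[] args) {
        // With monotonic timestamps the seek is correct:
        long[] ordered = {100, 200, 300, 400, 500};
        System.out.println(seek(ordered, 250)); // 2: first timestamp >= 250

        // A message stamped 250 arrived late, after the one stamped 300.
        long[] late = {100, 200, 300, 250, 400};
        int end = seek(late, 300); // 2: a segment cut at t=300 reads offsets before this
        System.out.println(end);
        // The late message at offset 3 (t=250) lies beyond the cut, so a
        // timestamp-bounded segment never picks it up: it is "lost" unless a
        // margin window widens the search.
    }
}
```

Fetching strictly by offset, as the new implementation does, sidesteps the problem entirely because offsets are monotonic by construction.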