I removed the code for the long type in BitmapCounter, because casting long to int can produce wrong results, while the goal of this measure is to provide an exact value. @Yerui, for your awareness; once we find a solution for long, we can add it back.
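
To illustrate why the cast is a problem, here is a minimal sketch. It is not the actual BitmapCounter code: the class and method names are made up for this example, and java.util.BitSet only stands in for a bitmap. It compares counting after a long-to-int cast with counting after mapping each value to its own integer id, as suggested further down in this thread.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; not part of Kylin.
final class DistinctCountDemo {

    // Counts distinct values after casting each long to int.
    // The narrowing cast keeps only the low 32 bits, so different longs can collide.
    static int countByCast(long[] values) {
        Set<Integer> seen = new HashSet<>();
        for (long v : values) {
            seen.add((int) v);
        }
        return seen.size();
    }

    // Counts distinct values by first assigning each long its own dense int id
    // (0, 1, 2, ...) and setting that bit. BitSet stands in for a bitmap here.
    static int countByDenseId(long[] values) {
        Map<Long, Integer> ids = new HashMap<>();
        BitSet bits = new BitSet();
        for (long v : values) {
            Integer id = ids.get(v);
            if (id == null) {
                id = ids.size();
                ids.put(v, id);
            }
            bits.set(id);
        }
        return bits.cardinality();
    }

    public static void main(String[] args) {
        long a = 1L;
        long b = (1L << 32) + 1L; // 4294967297: same low 32 bits as a
        long[] values = {a, b, a};
        System.out.println(countByCast(values));    // prints 1 (undercount)
        System.out.println(countByDenseId(values)); // prints 2 (exact)
    }
}
```

The id map in this sketch lives in memory, so it only shows the idea; how such a mapping would be built and stored for a real cube is a separate question.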

2016-01-28 22:13 GMT+08:00 ShaoFeng Shi <[email protected]>:

> what's the cardinality of the dimension that you want to count distinct
> values? Integer's range is enough for most cases, if your case is under
> this scope, you can try the bitmap with integer; but you need map the value
> to an unique id and use that within the bitmap. For example, if you want to
> count distinct users, use the numeric user_id, instead of email address; To
> support other data types, as Hongbin mentioned, the storage cost is very
> high, we don't have that plan.
>
> 2016-01-28 20:54 GMT+08:00 hongbin ma <[email protected]>:
>
>> KYLIN-1186 <https://issues.apache.org/jira/browse/KYLIN-1186> is not a
>> mature feature yet and it only supports integer
>> we don't yet have plans to support any other forms of precise distinct
>> count, as it is too expensive to pre-calculate
>>
>> On Thu, Jan 28, 2016 at 6:56 PM, Abhilash L L <[email protected]>
>> wrote:
>>
>> > Thanks ShaoFeng Shi,
>> >
>> > We might need for other data types as well
>> >
>> > date & string
>> >
>> > (eg, distinct count of dates of certain activity)
>> >
>> > So in the rest call instead of hllc return type it should be bitmap for
>> > int,tinyint etc ?
>> >
>> > And we still send it as hllc for other data types ?
>> >
>> >
>> > Also in one of the comments, it said we cast long to int.. wont we be
>> > losing data due to truncation ?
>> >
>> >
>> > Regards,
>> > Abhilash
>> >
>> > On Thu, Jan 28, 2016 at 3:43 PM, ShaoFeng Shi <[email protected]>
>> > wrote:
>> >
>> > > is this matched your case?
>> > > https://issues.apache.org/jira/browse/KYLIN-1186
>> > >
>> > > 2016-01-28 17:42 GMT+08:00 Abhilash L L <[email protected]>:
>> > >
>> > > > +user ml
>> > > >
>> > > > Regards,
>> > > > Abhilash
>> > > >
>> > > > On Thu, Jan 28, 2016 at 11:32 AM, Abhilash L L <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > Hello,
>> > > > >
>> > > > > Is there a way to ask Kylin to get exact distinct count ? From what
>> > > > > we understand, we can choose between hllc(10) to hllc(16)
>> > > > >
>> > > > > I understand that for every cuboid, you will need to go through the
>> > > > > whole data set again, but with the new cubing algo (2.x branch)
>> > > > > should be simpler to add ?
>> > > > >
>> > > > > If currently not present are there any plans to introduce this ?
>> > > > >
>> > > > > Regards,
>> > > > > Abhilash
>> > > > >
>> > > >
>> > >
>> > > --
>> > > Best regards,
>> > >
>> > > Shaofeng Shi
>> > >
>> >
>>
>> --
>> Regards,
>>
>> *Bin Mahone | 马洪宾*
>> Apache Kylin: http://kylin.io
>> Github: https://github.com/binmahone
>
> --
> Best regards,
>
> Shaofeng Shi

--
Best regards,

Shaofeng Shi
