RANK/DENSE_RANK on KYLIN

2016-04-06 Thread sdangi
Does Kylin support these analytic functions?

I'm hitting an issue while running this.  It works fine on Hive.

SELECT CST_KEY, AMT,
RANK() OVER( ORDER BY AMT DESC)
FROM FCT

Message: Error while executing SQL "SELECT CST_KEY, AMT, RANK() OVER( ORDER
BY AMT DESC) FROM TXN_FCT_ORC LIMIT 5": cannot translate call RANK()
OVER (ORDER BY $t2 DESC RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

*EXECUTION PLAN AFTER REWRITE*
OLAPToEnumerableConverter
  EnumerableLimit(fetch=[5])
EnumerableCalc(expr#0..3=[{inputs}], expr#4=[DENSE_RANK() OVER (ORDER BY
$t2 DESC RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)], CST_KEY=[$t1],
AMT=[$t2], EXPR$2=[$t4])
  OLAPTableScan(table=[[SCHEMA, FCT]], fields=[[0, 1, 2, 3]])

Caused by: java.lang.RuntimeException: cannot translate call RANK() OVER
(ORDER BY $t2 DESC RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
at
org.apache.calcite.adapter.enumerable.RexToLixTranslator.translateCall(RexToLixTranslator.java:533)
at
org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate0(RexToLixTranslator.java:507)
at
org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate(RexToLixTranslator.java:219)
at
org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate0(RexToLixTranslator.java:472)
at
org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate(RexToLixTranslator.java:219)
at
org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate(RexToLixTranslator.java:214)
at
org.apache.calcite.adapter.enumerable.RexToLixTranslator.translateList(RexToLixTranslator.java:700)
at
org.apache.calcite.adapter.enumerable.RexToLixTranslator.translateProjects(RexToLixTranslator.java:189)
at
org.apache.calcite.adapter.enumerable.EnumerableCalc.implement(EnumerableCalc.java:188)
at
org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.visitChild(EnumerableRelImplementor.java:97)
at
org.apache.kylin.query.relnode.OLAPRel$JavaImplementor.visitChild(OLAPRel.java:184)
at
org.apache.calcite.adapter.enumerable.EnumerableLimit.implement(EnumerableLimit.java:106)
at
org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.visitChild(EnumerableRelImplementor.java:97)
at
org.apache.kylin.query.relnode.OLAPRel$JavaImplementor.visitChild(OLAPRel.java:184)
at
org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:108)
at
org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:102)
at
org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92)
at
org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1171)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:297)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:196)
at
org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:721)
at
org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:588)
at
org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:558)
at
org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214)
at
org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:573)
at
org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:571)
at
org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:135)
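
For reference, the ranking the query asks for can be reproduced client-side on a fetched result set. This is a minimal Python sketch, not Kylin or Calcite code; the function name `rank_rows` and the sample rows are hypothetical and only illustrate how RANK and DENSE_RANK differ on ties.

```python
def rank_rows(rows, key):
    """Return (row, rank, dense_rank) tuples ordered by key, descending.

    RANK leaves gaps after ties (1, 2, 2, 4); DENSE_RANK does not (1, 2, 2, 3).
    """
    ordered = sorted(rows, key=key, reverse=True)
    result = []
    rank = dense = 0
    previous = object()  # sentinel that never equals a real value
    for position, row in enumerate(ordered, start=1):
        value = key(row)
        if value != previous:
            rank = position   # RANK jumps to the row's position
            dense += 1        # DENSE_RANK advances by one
            previous = value
        result.append((row, rank, dense))
    return result


# Hypothetical (CST_KEY, AMT) rows standing in for a query result.
sample = [(1, 50), (2, 70), (3, 50), (4, 20)]
ranked = rank_rows(sample, key=lambda r: r[1])
```

Sorting in the client only makes sense when the result set is small enough to fetch in full.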


--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/RANK-DENSE-RANK-on-KYLIN-tp4066.html
Sent from the Apache Kylin mailing list archive at Nabble.com.


VER1.5 -- Cannot find rowkey column DT_KEY in cube CubeDesc [name=TEST_CUBE]

2016-03-28 Thread sdangi
I have designed models and cubes on versions 1.2 and 1.3 in the past without issue.  I'm
hitting this error with 1.5.  Please check the model and cube JSON below and let
me know if anything stands out as a possible cause.


*Error Message
Cannot find rowkey column DT_KEY in cube CubeDesc [name=TEST_CUBE]*


MODEL:

{
  "uuid": "dd8395e2-0da3-48b1-8a0c-4165d477e7c5",
  "version": "1.5.0",
  "name": "TEST_MODEL",
  "description": "",
  "lookups": [
{
  "table": "SCHM.DT_DIM_ORC",
  "join": {
"type": "inner",
"primary_key": [
  "DT_KEY"
],
"foreign_key": [
  "TXN_BOOK_DT_KEY"
]
  }
},
{
  "table": "SCHM.CST_DIM_ORC",
  "join": {
"type": "inner",
"primary_key": [
  "CST_KEY"
],
"foreign_key": [
  "FIRM_CST_KEY"
]
  }
}
  ],
  "dimensions": [
{
  "table": "SCHM.TXN_FCT_ORC_SM",
  "columns": []
},
{
  "table": "SCHM.DT_DIM_ORC",
  "columns": [
"DT_KEY"
  ]
},
{
  "table": "SCHM.CST_DIM_ORC",
  "columns": [
"CST_NM"
  ]
}
  ],
  "metrics": [
"USD_TXN_AMT"
  ],
  "capacity": "MEDIUM",
  "last_modified": 1459175903495,
  "fact_table": "SCHM.TXN_FCT_ORC_SM",
  "filter_condition": "",
  "partition_desc": {
"partition_date_column": "SCHM.TXN_FCT_ORC_SM.TXN_BOOK_DT_KEY",
"partition_time_column": null,
"partition_date_start": 0,
"partition_date_format": "-MM-dd",
"partition_time_format": "HH:mm:ss",
"partition_type": "APPEND",
"partition_condition_builder":
"org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
  }
}


CUBE:
===

{
  "name": "TEST_CUBE",
  "model_name": "TEST_MODEL",
  "description": "",
  "dimensions": [
{
  "name": "CST_DIM_CST_NM",
  "table": "SCHM.CST_DIM_ORC",
  "derived": null,
  "column": "CST_NM"
},
{
  "name": "DT_DIM_DT_KEY",
  "table": "SCHM.DT_DIM_ORC",
  "derived": null,
  "column": "DT_KEY"
}
  ],
  "measures": [
{
  "name": "_COUNT_",
  "function": {
"expression": "COUNT",
"returntype": "bigint",
"parameter": {
  "type": "constant",
  "value": "1",
  "next_parameter": null
}
  }
},
{
  "name": "USD_TXN_AMT",
  "function": {
"expression": "SUM",
"returntype": "decimal(32,8)",
"parameter": {
  "type": "column",
  "value": "USD_TXN_AMT",
  "next_parameter": null
}
  }
}
  ],
  "rowkey": {
"rowkey_columns": [
  {
"column": "CST_NM",
"encoding": "dict"
  },
  {
"column": "DT_KEY",
"encoding": "dict"
  }
]
  },
  "aggregation_groups": [
{
  "includes": [
"CST_NM",
"DT_KEY"
  ],
  "select_rule": {
"hierarchy_dims": [],
"mandatory_dims": [],
"joint_dims": []
  }
}
  ],
  "partition_date_start": 138853440,
  "notify_list": [],
  "hbase_mapping": {
"column_family": [
  {
"name": "f1",
"columns": [
  {
"qualifier": "m",
"measure_refs": [
  "_COUNT_",
  "USD_TXN_AMT"
]
  }
]
  }
]
  },
  "retention_range": "0",
  "auto_merge_time_ranges": [
60480,
241920
  ],
  "engine_type": 2,
  "storage_type": 2
}
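
One quick sanity check on a descriptor like this is to confirm that every rowkey column is declared by some cube dimension. The following is a minimal Python sketch under that assumption; `missing_rowkey_columns` is a hypothetical helper, not part of Kylin, and the list-valued `column` case is handled only defensively.

```python
def missing_rowkey_columns(cube_desc):
    """Return rowkey columns that no cube dimension declares."""
    declared = set()
    for dim in cube_desc.get("dimensions", []):
        column = dim.get("column")
        # Assumption: "column" may be a single name or a list of names.
        if isinstance(column, str):
            declared.add(column)
        elif column:
            declared.update(column)
        declared.update(dim.get("derived") or [])
    rowkeys = cube_desc.get("rowkey", {}).get("rowkey_columns", [])
    return [rk["column"] for rk in rowkeys if rk["column"] not in declared]


# Trimmed copy of the cube descriptor above (dimensions and rowkey only).
cube = {
    "dimensions": [
        {"name": "CST_DIM_CST_NM", "derived": None, "column": "CST_NM"},
        {"name": "DT_DIM_DT_KEY", "derived": None, "column": "DT_KEY"},
    ],
    "rowkey": {"rowkey_columns": [{"column": "CST_NM"}, {"column": "DT_KEY"}]},
}
missing = missing_rowkey_columns(cube)
```

For the descriptor as posted this check reports nothing missing, so it only rules out one simple kind of mismatch.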

Thanks,
Regards,

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/VER1-5-Cannot-find-rowkey-column-DT-KEY-in-cube-CubeDesc-name-TEST-CUBE-tp3982.html


java.lang.IllegalStateException: No resource found at

2016-03-05 Thread sdangi
Hi,

I have run many jobs successfully and never hit an issue on the 2nd step
before.  I'm not sure how this exception came about.  Any ideas?


[pool-7-thread-10]:[2016-03-05
21:57:08,353][ERROR][org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:86)]
- error in FactDistinctColumnsJob
java.lang.IllegalStateException: No resource found at --
/cube_desc/TOPN_CST_CCY_BR_TXN_FINAL.json
at
org.apache.kylin.job.hadoop.AbstractHadoopJob.dumpResources(AbstractHadoopJob.java:352)
at
org.apache.kylin.job.hadoop.AbstractHadoopJob.attachKylinPropsAndMetadata(AbstractHadoopJob.java:298)
at
org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:81)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at
org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
at
org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
at
org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[pool-7-thread-10]:[2016-03-05
21:57:08,376][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob.cleanupTempConfFile(AbstractHadoopJob.java:401)]
- tempMetaFileString is : null
[pool-7-thread-10]:[2016-03-05
21:57:08,379][ERROR][org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:123)]
- error execute
MapReduceExecutable{id=8fc2c419-600a-4950-8e55-92547460adee-01, name=Extract
Fact Table Distinct Columns, state=RUNNING}
java.lang.IllegalStateException: No resource found at --
/cube_desc/TOPN_CST_CCY_BR_TXN_FINAL.json
at
org.apache.kylin.job.hadoop.AbstractHadoopJob.dumpResources(AbstractHadoopJob.java:352)
at
org.apache.kylin.job.hadoop.AbstractHadoopJob.attachKylinPropsAndMetadata(AbstractHadoopJob.java:298)
at
org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:81)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at
org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
at
org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
at
org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/java-lang-IllegalStateException-No-resource-found-at-tp3802.html


Re: [jira] [Created] (KYLIN-1089) Kylin failed to run on CDH with HBase 1.0

2016-03-03 Thread sdangi
Team -- More than happy to.  Can someone guide me on how to get the patch
out?

Thanks,
Regards,

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/jira-Created-KYLIN-1089-Kylin-failed-to-run-on-CDH-with-HBase-1-0-tp2027p3784.html


Re: [jira] [Created] (KYLIN-1089) Kylin failed to run on CDH with HBase 1.0

2016-02-27 Thread sdangi
I have some good news.  I have rebuilt the current Kylin master (1.2) against
CDH5.5.3.  I had to bring in some of the 1.1.3 branch changes that deal with the
Cloudera API's additional methods on the RegionScanner interface.

Also, I had to exclude some HBase (server) dependencies in the pom and upgrade
the Curator binaries.  It compiles and builds OK, the server starts OK, and the
Kylin sample cube build and queries work fine.  I also applied this to our
ongoing POC with over 1B rows.  The build and most queries work.  However, one
query against the time dimension (which I had reported to Luke directly due to
its sensitive nature) is throwing a new error.  Earlier it was an
AbstractMethodError related to the getBatch interface of Scanner, which is now
fixed thanks to binary compatibility with CDH5.5.2.

But now it fails with:


Caused by:
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException):
org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.AbstractMethodError
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2065)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
at
org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.AbstractMethodError
at
org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2278)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32205)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2034)
... 4 more

at
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1219)
at
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
at
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:213)


I have remote debugged it.  Here is the stack

 

A similar query (see below) on the sample cube works fine.

//this works
SELECT SUM(PRICE) FROM  KYLIN_SALES
WHERE PART_DT=DATE'2012-01-01'

//this does not (1B rows in fact table)
SELECT
  CURRENCY
,SUM(TXN_AMT)  TOT_USD_TXN_AMT
FROM TXN_FCT_ORC as TXN_FCT_ORC
INNER JOIN DIM_ORC as DT_DIM_ORC
ON BOOK_DT_KEY = DT_DIM_ORC.DT_KEY
INNER JOIN CCY_DIM_ORC as CCY_DIM_ORC
ON TXN_FCT_ORC.RMTR_CCY_KEY = CCY_DIM_ORC.CCY_KEY
WHERE DT_DIM_ORC.DT_KEY=date'2015-03-03'
GROUP BY
CCY_DIM_ORC.CCY_NM, DT_DIM_ORC.CDR_YR
ORDER BY TOT_USD_TXN_AMT  DESC




--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/jira-Created-KYLIN-1089-Kylin-failed-to-run-on-CDH-with-HBase-1-0-tp2027p3745.html


Re: [jira] [Created] (KYLIN-1089) Kylin failed to run on CDH with HBase 1.0

2016-02-27 Thread sdangi
Team -- Any successful builds against CDH5.5.x? I have just attempted one with
changes in the job/storage packages, borrowed from the 1.1.4 branch, to handle
the HBase interface changes.  The 1.0.0-cdh5.5.x version of HBase from Cloudera
does not align with 1.0.0 from Apache.  However, I have got a clean build and am
testing it.



The pom is as below:

   
2.6.0-cdh5.5.2
2.6.0-cdh5.5.2
3.4.5-cdh5.5.2
1.1.0-cdh5.5.2
   
1.1.0-cdh5.5.2
   
1.0.0-cdh5.5.2



[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Kylin:HadoopOLAPEngine ... SUCCESS [  0.919 s]
[INFO] Kylin:AtopCalcite ........ SUCCESS [  2.660 s]
[INFO] Kylin:Common ............. SUCCESS [ 46.248 s]
[INFO] Kylin:Metadata ........... SUCCESS [  5.385 s]
[INFO] Kylin:Dictionary ......... SUCCESS [  1.558 s]
[INFO] Kylin:Cube ............... SUCCESS [  3.652 s]
[INFO] Kylin:InvertedIndex ...... SUCCESS [  0.635 s]
[INFO] Kylin:Job ................ SUCCESS [  7.054 s]
[INFO] Kylin:Storage ............ SUCCESS [  3.900 s]
[INFO] Kylin:Query .............. SUCCESS [  1.272 s]
[INFO] Kylin:JDBC ............... SUCCESS [  2.235 s]
[INFO] Kylin:RESTServer ......... SUCCESS [ 11.829 s]
[INFO] Kylin:Monitor ............ SUCCESS [  1.129 s]
[INFO]


I will report any issues on running and building the cubes.


--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/jira-Created-KYLIN-1089-Kylin-failed-to-run-on-CDH-with-HBase-1-0-tp2027p3742.html


Re: TopN Results Differ in Hive and Kylin

2016-01-17 Thread sdangi
I was looking at the commits.  Ten days back, 2 files (QueryReqeust and
SQLRequest.java) were touched, which fixed the issue for clients using the
REST API by changing the default acceptPartial to false.

I tried with Zeppelin and it works: the results from Hive and Zeppelin (via
the REST API) now match.  However, the response time has gone up considerably
and queries are taking far longer.  Where should I be looking to improve the
performance?  It seems that as part of the HBase scans, the dictionary info is
repeatedly loaded - can we not read it from the cache?


[http-bio-80-exec-6]:[2016-01-17
09:27:33,501][DEBUG][org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:378)]
- Going to load DictionaryInfo from
/dict/KYLIN_DEMO.CUSTOMER_DIM_T/L1_CST_NM/3c9c05aa-c8e9-4bfa-861b-2f3794cb3209.dict
2016-01-17 09:27:33,726 WARN  [http-bio-80-exec-6] hdfs.DFSClient:
DFSInputStream has been closed already
[http-bio-80-exec-6]:[2016-01-17
09:27:33,726][DEBUG][org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:382)]
- Loaded dictionary at
/dict/KYLIN_DEMO.CUSTOMER_DIM_T/L1_CST_NM/3c9c05aa-c8e9-4bfa-861b-2f3794cb3209.dict
[http-bio-80-exec-6]:[2016-01-17
09:27:35,443][DEBUG][org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:378)]
- Going to load DictionaryInfo from
/dict/KYLIN_DEMO.CUSTOMER_DIM_T/L1_CST_NM/3c9c05aa-c8e9-4bfa-861b-2f3794cb3209.dict
2016-01-17 09:27:35,669 WARN  [http-bio-80-exec-6] hdfs.DFSClient:
DFSInputStream has been closed already
[http-bio-80-exec-6]:[2016-01-17
09:27:35,669][DEBUG][org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:382)]
- Loaded dictionary at
/dict/KYLIN_DEMO.CUSTOMER_DIM_T/L1_CST_NM/3c9c05aa-c8e9-4bfa-861b-2f3794cb3209.dict
[http-bio-80-exec-6]:[2016-01-17
09:27:37,469][DEBUG][org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:378)]
- Going to load DictionaryInfo from
/dict/KYLIN_DEMO.CUSTOMER_DIM_T/L1_CST_NM/3c9c05aa-c8e9-4bfa-861b-2f3794cb3209.dict
2016-01-17 09:27:37,697 WARN  [http-bio-80-exec-6] hdfs.DFSClient:
DFSInputStream has been closed already
[http-bio-80-exec-6]:[2016-01-17
09:27:37,697][DEBUG][org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:382)]
- Loaded dictionary at
/dict/KYLIN_DEMO.CUSTOMER_DIM_T/L1_CST_NM/3c9c05aa-c8e9-4bfa-861b-2f3794cb3209.dict
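
To make the caching question concrete: the log shows the same dictionary path being loaded three times within about four seconds. The sketch below shows the general idea of memoizing such a load by path. It is an illustration in Python using `functools.lru_cache`, not a description of how Kylin's DictionaryManager works; the loader function and the path are hypothetical.

```python
from functools import lru_cache

load_count = 0  # counts simulated reads from storage


@lru_cache(maxsize=128)
def load_dictionary(path):
    """Simulate an expensive dictionary load; the result is cached per path."""
    global load_count
    load_count += 1
    return {"path": path}


# Hypothetical path; three lookups, as in the log excerpt above.
path = "/dict/EXAMPLE_TABLE/EXAMPLE_COL/example.dict"
for _ in range(3):
    load_dictionary(path)
```

With the cache in place only the first lookup reaches storage; the other two are served from memory.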

Thanks,



--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/TopN-Results-Differ-in-Hive-and-Kylin-tp3288p3296.html


Re: TopN Results Differ in Hive and Kylin

2016-01-16 Thread sdangi
From the FAQ:
Getting wrong result for the query with order by

By default if you're making queries on the web client, a mode called
"AcceptPartialResults" is enabled, this is a protection mechanism that will
only return part of the results to reduce server overhead. Honestly it might
hurt the correctness of order by queries.

If you're seeking 100% correctness, after running the query you will find a
notification: Note: Current results are partial, please click 'Show All'
button to get all results. Click the "Show All" button to disable the
"AcceptPartialResults" mode, and you'll get a right result.

Notice "AcceptPartialResults" is only enabled by default at web client,
you'll not meet such problems if you're using JDBC, ODBC or standard REST
API.
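
For clients that go through the REST API, the partial-results behaviour can be set explicitly in the request body. This is a minimal Python sketch that only builds the JSON payload and sends nothing. The field names (`sql`, `offset`, `limit`, `acceptPartial`, `project`) follow my reading of the Kylin REST query API and should be checked against the version in use; the helper name, project name and SQL text are placeholders.

```python
import json


def build_query_payload(sql, project, limit=50000, accept_partial=False):
    """Build a query request body with partial results switched off by default."""
    return {
        "sql": sql,
        "offset": 0,
        "limit": limit,
        "acceptPartial": accept_partial,  # False asks for the complete result
        "project": project,
    }


# Placeholder SQL and project name.
payload = build_query_payload(
    "SELECT NM, SUM(TXN_AMT) FROM T GROUP BY NM ORDER BY 2 DESC LIMIT 10",
    project="my_project",
)
body = json.dumps(payload)  # string to send as the POST body
```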

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/TopN-Results-Differ-in-Hive-and-Kylin-tp3288p3292.html


TopN Results Differ in Hive and Kylin

2016-01-16 Thread sdangi
I'm hitting an issue and am not able to understand clearly what the root
cause is.  

I have customer and time dimensions and a transaction fact table.  The customer
table has 1MM rows and the transaction fact table has 154MM.

I have to find the top N customers by transaction volume for a year.  I ran the
following query in Hive for the year 2015.

TXN_AMT in the transaction fact table is of type decimal(32,8).

SELECT
  CUSTOMER_DIM_T.NM, SUM(TXN_AMT) TOTAL_TXN_AMT  
FROM TXN_FCT_T as TXN_FCT_T 
INNER JOIN CUSTOMER_DIM_T as CUSTOMER_DIM_T
ON TXN_FCT_T.CST_KEY = CUSTOMER_DIM_T.KEY
INNER JOIN DATE_DIM_T as DATE_DIM_T
ON TXN_FCT_T.TXN_BK_DT_KEY = DATE_DIM_T.DT_KEY
WHERE DATE_DIM_T.CAL_YR=2015
GROUP BY
CUSTOMER_DIM_T.NM
ORDER BY TOTAL_TXN_AMT DESC LIMIT 10;

CUST AMT

A 177070809652.52
B 156918629669.59
C 145634838958.87
D 137781561987.28
E 137470272887.12
F 136782827759.93
G 129986897552.65
H 127105433950.33
I 115152934891.42
J 107505491520.64

If I run the same query in Kylin, I get a totally different answer.  What
bothers me is that as I increase the LIMIT number, the response keeps
changing: I keep getting a different set of customers with higher txn
amts.

CUST AMT

K 4014698676
L 2727082526
M 1344354210
N 1216966910
O 554963079.2
P 390827714.4
Q 367123639.6
R 347732036.1
S 313686532
T 311987500.6

Aren't the results first sorted then limited in Kylin?
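
The symptom described here (a different, larger-valued set of customers each time LIMIT grows) is what one would see if rows were truncated before being ordered. The Python sketch below contrasts the two orders of operation on made-up totals. It illustrates the general effect only and makes no claim about Kylin's internals; the function names and figures are hypothetical.

```python
import heapq


def top_n_sorted_then_limited(totals, n):
    """Order all groups by total, then keep the first n (the expected top N)."""
    return heapq.nlargest(n, totals.items(), key=lambda kv: kv[1])


def top_n_limited_then_sorted(totals, n):
    """Keep the first n groups in arrival order, then sort only those."""
    partial = list(totals.items())[:n]  # truncation happens before ordering
    return sorted(partial, key=lambda kv: kv[1], reverse=True)


# Made-up per-customer totals, in the order a scan might return them.
totals = {"K": 40, "A": 177, "L": 27, "B": 156}
expected = top_n_sorted_then_limited(totals, 2)
truncated = top_n_limited_then_sorted(totals, 2)
```

On this data `expected` holds customers A and B, while `truncated` holds A and K, and raising `n` changes which customers the truncated variant reports.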

If I run the query in Kylin filtering on customer A above, the cuboid seems
to have the right aggregated data

SELECT
  CUSTOMER_DIM_T.NM, SUM(TXN_AMT) TOTAL_TXN_AMT  
FROM TXN_FCT_T as TXN_FCT_T 
INNER JOIN CUSTOMER_DIM_T as CUSTOMER_DIM_T
ON TXN_FCT_T.CST_KEY = CUSTOMER_DIM_T.KEY
INNER JOIN DATE_DIM_T as DATE_DIM_T
ON TXN_FCT_T.TXN_BK_DT_KEY = DATE_DIM_T.DT_KEY
WHERE DATE_DIM_T.CAL_YR=2015 and CUSTOMER_DIM_T.NM='A'
GROUP BY
CUSTOMER_DIM_T.NM
ORDER BY TOTAL_TXN_AMT DESC LIMIT 10;


CUST AMT

A 177070809652.52

Any ideas?

Thanks,

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/TopN-Results-Differ-in-Hive-and-Kylin-tp3288.html


Kylin and Tableau -- Top N query

2016-01-14 Thread sdangi
Results from Kylin and Tableau on a live connection don't match.  Any reason? 
I'm creating a custom data source (Custom SQL Query) in Tableau and adding a
parameter control using a query similar to below:

SELECT
t2.c1
,sum(t1.c2) AS c3
FROM t1
Inner join t2
on t1.k1 = t2.k1
group by t2.c1
order by c3
LIMIT 

t1 (fact) has 130MM rows and t2 (dimension) has 1.7MM

The query shows different Top N records in Tableau as compared to Kylin and
Hive.

Thanks,
Regards,

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/Kylin-and-Tableau-Top-N-query-tp3250.html


Error when scan from lower key

2016-01-06 Thread sdangi
Hi Kylin Team --



I'm hitting the error below when executing a query with the time dimension in
the WHERE clause against a 35GB cube (no issues running it otherwise).  The
exact same query, with the WHERE clause, works fine against Hive.

Any ideas?

Thanks,
Regards,

Error when scan from lower key � ;8 to upper key � on
table KYLIN_PWNXRLXEZ7. while executing SQL: "SELECT
CUSTOMER_DIM_T.PRN_CST_NM ,CURRENCY_DIM_T.SHRT_NM ,BRANCH_DIM_T.BR_NM
,COUNTRY_DIM_T.CTY_NM ,SUM(AML_TXN_FCT_CUB_T.USD_TXN_AMT)
,SUM(AML_TXN_FCT_CUB_T.AC_CCY_TXN_AMT) FROM
DL_FINSERV_DEMO.AML_TXN_FCT_CUB_T as AML_TXN_FCT_CUB_T INNER JOIN
DL_FINSERV_DEMO.CUSTOMER_DIM_T as CUSTOMER_DIM_T ON
AML_TXN_FCT_CUB_T.FIRM_CST_KEY = CUSTOMER_DIM_T.KEY INNER JOIN
DL_FINSERV_DEMO.BRANCH_DIM_T as BRANCH_DIM_T ON AML_TXN_FCT_CUB_T.TXN_BR_KEY
= BRANCH_DIM_T.BR_KEY INNER JOIN DL_FINSERV_DEMO.COUNTRY_DIM_T as
COUNTRY_DIM_T ON AML_TXN_FCT_CUB_T.RMTR_CTY_KEY = COUNTRY_DIM_T.CTY_KEY
INNER JOIN DL_FINSERV_DEMO.CURRENCY_DIM_T as CURRENCY_DIM_T ON
AML_TXN_FCT_CUB_T.RMTR_CCY_KEY = CURRENCY_DIM_T.KEY INNER JOIN
DL_FINSERV_DEMO.DATE_DIM_T as DATE_DIM_T ON
AML_TXN_FCT_CUB_T.TXN_BOOK_DT_KEY = DATE_DIM_T.DT_KEY 

WHERE DATE_DIM_T.DT_KEY >= date'2015-04-01' 

GROUP BY CUSTOMER_DIM_T.PRN_CST_NM ,CURRENCY_DIM_T.SHRT_NM
,BRANCH_DIM_T.BR_NM ,COUNTRY_DIM_T.CTY_NM LIMIT 5000"

--
View this message in context: 
http://apache-kylin.74782.x6.nabble.com/Error-when-scan-from-lower-key-tp3083.html