Hello Alberto,

Thank you for your answer. I will look further for this mistake in the cube building.

Concerning the RAW measure, are you referring to this discussion? I can still see this option in the measures section of Kylin 2.2, which is why it caught my attention. Does it mean that to access raw data, we first need to go through an aggregated measure? My end users mainly work on raw data (e.g. slicing), so I want to be sure about that.

What about building cubes using only a fact table with all the data inside? Is this a workable approach (in terms of storage space and efficiency), or is it preferable to use separate tables for dimensions, and why?
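To make the question concrete, here is roughly what I mean; the table and column names below are invented examples, not our real schema:

-- Option A: a single flat fact table carrying every attribute (like our CSV):
CREATE TABLE sales_flat (
  sale_date    DATE,
  store_name   STRING,
  store_region STRING,
  product_name STRING,
  amount       DOUBLE
);

-- Option B: a star schema, where the fact table keeps only keys and measures:
CREATE TABLE sales_fact (
  sale_date  DATE,
  store_id   INT,    -- FK to dim_store.store_id
  product_id INT,    -- FK to dim_product.product_id
  amount     DOUBLE
);
CREATE TABLE dim_store   (store_id INT, store_name STRING, store_region STRING);
CREATE TABLE dim_product (product_id INT, product_name STRING);

If I understand the Kylin documentation correctly, small lookup tables are snapshotted and their columns can serve as derived dimensions, so option B keeps the cuboid combinations down, while in option A every descriptive column becomes a full dimension on the fact table. Please correct me if I have that wrong.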
Thank you in advance for your help. Have a good day.

Best regards,
Jean-Luc

From: Alberto Ramón [mailto:[email protected]]
Sent: Wednesday, 28 February 2018 19:04
To: user <[email protected]>
Subject: Re: Questions about 'RAW' measure

Hello

- The RAW format is deprecated. You will find the thread in this mailing list.
- "Job hasn't been submitted after" sounds like a configuration problem with your YARN; please search for it on Google and review your CPU and RAM resources.

On 28 February 2018 at 16:44, BELLIER Jean-luc <[email protected]> wrote:

Hello,

I discovered that there was a RAW measure to get raw data instead of aggregated data (http://kylin.apache.org/blog/2016/05/29/raw-measure-in-kylin/). My assumption is that these raw data are stored in HBase, just as aggregated data are, i.e. they are duplicated from Hive into HBase. So my question is: are there limitations on the data volume? My fact tables contain billions of rows and we need to get detailed information from them. What are the restrictions, and what are the benefits compared to querying the data directly in Hive?

I have another question: I tested creating a model directly from a fact table containing raw data, in order to run a feasibility test and avoid transformations (the table is a CSV file provided by an external team). As a first step, I wanted to avoid creating files for the corresponding dimensions and generating a “clean” fact table whose foreign keys correspond to the primary keys of the dimension tables. The creation of the model was OK. However, the cube build failed at the first step, and I got this message:

INFO : Query ID = hive_20180228120101_6990f9d4-182d-4dd9-b319-fce02caf75ef
INFO : Total jobs = 3
INFO : Launching Job 1 out of 3
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : Starting Spark Job = 3556ecc6-2609-4085-bcca-b1b81fa9855c
ERROR : Job hasn't been submitted after 61s. Aborting it.

How could I proceed to avoid this? Are there Kylin parameters (or others) to adjust? (See my P.S. below for the kind of settings I am wondering about.)

Thank you in advance for your help. Have a good day.

Best regards,
Jean-Luc
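P.S. To make my question more precise: is it something like the following that needs tuning? The property names are from the Hive-on-Spark documentation, and hive.spark.job.monitor.timeout looks like the setting behind the "Job hasn't been submitted after Ns" message, but the values here are only guesses on my part:

-- In the Hive session that builds the flat table (assuming Hive on Spark):
-- raise the job monitor timeout (default 60s) behind the error above:
set hive.spark.job.monitor.timeout=180s;
-- allow the Spark remote driver more time to connect back, if needed:
set hive.spark.client.server.connect.timeout=300000ms;
-- Also worth checking on the YARN side: yarn.nodemanager.resource.memory-mb
-- and yarn.scheduler.maximum-allocation-mb, per Alberto's CPU/RAM remark.

If Kylin drives this Hive step itself, I assume these settings would have to go into the Hive configuration that Kylin uses rather than into an interactive session, but I am not sure where.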
