[jira] [Commented] (KYLIN-4625) Debug the code of Kylin on Parquet without hadoop environment

ASF GitHub Bot (Jira) Thu, 09 Jul 2020 08:21:16 -0700


    [ https://issues.apache.org/jira/browse/KYLIN-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154663#comment-17154663 ]


ASF GitHub Bot commented on KYLIN-4625:
---------------------------------------

bigxiaochu commented on a change in pull request #1312:
URL: https://github.com/apache/kylin/pull/1312#discussion_r452297764



##########
File path: kylin-spark-project/kylin-spark-engine/src/main/java/org/apache/kylin/engine/spark/source/CsvSource.java
##########
@@ -59,7 +59,12 @@ public CsvSource(KylinConfig config) {
                         && (parameters != null && parameters.get("separator") == null)) {
                     path = "file:///" + new File(getUtMetaDir(),
                             "../../examples/test_case_data/parquet_test/data/" + table.identity() + ".csv")
-                                    .getAbsolutePath();
+                            .getAbsolutePath();
+                    separator = "";
+                } else if (kylinConfig.getDeployEnv().equals("LOCAL")) {
+                    path = "file:///" + new File(getUtMetaDir(),
+                            "../parquet_test/data/" + table.identity() + ".csv")

Review comment:
       Why is the UT path "../../examples/test_case_data/parquet_test/data/" but
the local path "../parquet_test/data/"? Can they use the same path?
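
To make the question concrete, here is a minimal sketch of how the two relative paths from the diff resolve against one and the same base directory. The base directory below is a hypothetical stand-in, since the actual return value of getUtMetaDir() is not shown in this hunk, and the class and method names are illustrative only.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class CsvPathCompare {
    // Relative data directories copied from the two branches in the diff.
    static final String UT_RELATIVE = "../../examples/test_case_data/parquet_test/data/";
    static final String LOCAL_RELATIVE = "../parquet_test/data/";

    // Resolve a relative directory against a base directory and collapse ".." segments.
    static Path resolve(String base, String relative) {
        return Paths.get(base).resolve(relative).normalize();
    }

    public static void main(String[] args) {
        // Assumption: a made-up metadata directory, NOT the real getUtMetaDir() value.
        String assumedMetaDir = "/tmp/kylin/examples/test_case_data/sandbox";

        System.out.println("UT branch    -> " + resolve(assumedMetaDir, UT_RELATIVE));
        System.out.println("LOCAL branch -> " + resolve(assumedMetaDir, LOCAL_RELATIVE));
    }
}
```

For any single base directory the two results differ, so the two branches can only point at the same data if getUtMetaDir() returns different directories in UT and LOCAL mode.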




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> Debug the code of Kylin on Parquet without hadoop environment
> -------------------------------------------------------------
>
>                 Key: KYLIN-4625
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4625
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Spark Engine
>            Reporter: wangrupeng
>            Assignee: wangrupeng
>            Priority: Major
>         Attachments: image-2020-07-08-17-41-35-954.png, 
> image-2020-07-08-17-42-09-603.png, screenshot-1.png
>
>
> Currently, Kylin on Parquet already supports debugging the source code with local 
> CSV files, without depending on a remote HDP sandbox, but the process is somewhat 
> complex. The steps are as follows:
>  * Edit $KYLIN_SOURCE_DIR/examples/test_case_data/sandbox/kylin.properties 
> to use local settings:
> {code:java}
>  kylin.metadata.url=$LOCAL_META_DIR
>  kylin.env.zookeeper-is-local=true
>  kylin.env.hdfs-working-dir=file:///path/to/local/dir
>  kylin.engine.spark-conf.spark.master=local
>  kylin.engine.spark-conf.spark.eventLog.dir=/path/to/local/dir
>  kylin.env=UT{code}
>  * Debug org.apache.kylin.rest.DebugTomcat in IDEA with the VM option 
> "-Dspark.local=true".
>  !image-2020-07-08-17-41-35-954.png|width=574,height=363!
>  * Load the CSV data source by clicking "Data Source->Load CSV File as 
> Table" on the "Model" page, and set the schema for your table. Then click 
> "submit" to save.
>  !image-2020-07-08-17-42-09-603.png|width=577,height=259!
> Most of the time when debugging, we just want to build and query a cube quickly 
> and focus on the bug we are trying to resolve. But the current way of loading CSV 
> tables and creating a model and cube is complex, and it is hard to use the Kylin 
> sample cube. So I want to add a CSV source that uses the model of the Kylin 
> sample data directly when the debug Tomcat is started.
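
As a rough illustration of the first step in the description, the local overrides could be collected with java.util.Properties before being written into kylin.properties. This is only a sketch: the class name, method name, and directory arguments are hypothetical, and only the property keys and fixed values come from the issue text.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

public class LocalDebugProps {
    // Build the overrides listed in the issue description.
    // localMetaDir and localWorkDir are placeholders supplied by the caller.
    static Properties localOverrides(String localMetaDir, String localWorkDir) {
        Properties p = new Properties();
        p.setProperty("kylin.metadata.url", localMetaDir);
        p.setProperty("kylin.env.zookeeper-is-local", "true");
        // Assumes localWorkDir is an absolute path starting with "/".
        p.setProperty("kylin.env.hdfs-working-dir", "file://" + localWorkDir);
        p.setProperty("kylin.engine.spark-conf.spark.master", "local");
        p.setProperty("kylin.engine.spark-conf.spark.eventLog.dir", localWorkDir);
        p.setProperty("kylin.env", "UT");
        return p;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder directories; substitute real local paths.
        Properties p = localOverrides("/path/to/local/meta", "/path/to/local/dir");
        StringWriter out = new StringWriter();
        p.store(out, "Local debug overrides for Kylin on Parquet");
        System.out.print(out);
    }
}
```

Note that Properties.store escapes some characters such as ":" in its output, so the printed text is meant to be read back with Properties.load rather than pasted verbatim.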



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
