Hi All,

I am trying to use the HiveOutputModule to insert ingested data into a Hive external table. The table is already created with the same location as the dt.application.<app_name>.operator.hiveOutput.prop.filePath property, and its partition column is accessdate. With the configuration below in the properties file, the HDFS file structure I am expecting is:
/common/data/test/accessCounts
|-- accessdate=2017-05-15
|   |-- <fil1>
|   |-- <fil2>
|-- accessdate=2017-05-16
|   |-- <fil1>
|   |-- <fil2>

but the actual structure looks like:

/common/data/test/accessCounts/<yarn_application_id_for_apex_ingest_appl>/10
|-- 2017-05-15
|   |-- <fil1>
|   |-- <fil2>
|-- 2017-05-16
|   |-- <fil1>
|   |-- <fil2>

Questions:
1. Why are the yarn_application_id and the other extra directories created, when they are nowhere specified in the configuration?
2. If I want to achieve the structure I described, what other configurations do I need to set?

HiveOutputModule Configs
========================

<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.filePath</name>
  <value>/common/data/test/accessCounts</value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.databaseUrl</name>
  <value><jdbc_url></value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.databaseDriver</name>
  <value>org.apache.hive.jdbc.HiveDriver</value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.tablename</name>
  <value><hive table name where records need to be inserted></value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.hivePartitionColumns</name>
  <value>{accessdate}</value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.hivePartitionColumnDataTypes</name>
  <value>{STRING}</value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.password</name>
  <value><hive connection password></value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.userName</name>
  <value><hive connection user></value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.hiveColumns</name>
  <value>{col1,col2,col3,col4}</value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.hiveColumnDataTypes</name>
  <value>{STRING,STRING,STRING,STRING}</value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.expressionsForHiveColumns</name>
  <value>{"getCol1()","getCol2()","getCol3()","getCol4()"}</value>
</property>
<property>
  <name>dt.application.<app_name>.operator.hiveOutput.prop.expressionsForHivePartitionColumns</name>
  <value>{"getAccessdate()"}</value>
</property>

--
View this message in context: http://apache-apex-users-list.78494.x6.nabble.com/HiveOutputModule-creating-extra-directories-than-specified-while-saving-data-into-HDFS-tp1620.html
Sent from the Apache Apex Users mailing list archive at Nabble.com.
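For reference, a sketch of the external-table DDL that would line up with the configuration above. The table name, column names, and row format here are illustrative placeholders, not taken from the original post; only the four STRING columns, the accessdate STRING partition column, and the LOCATION are implied by the configuration:

```sql
-- Hypothetical DDL matching the configuration above (names are placeholders).
-- LOCATION matches the filePath property; the partition column accessdate
-- matches hivePartitionColumns / hivePartitionColumnDataTypes.
CREATE EXTERNAL TABLE access_counts (
  col1 STRING,
  col2 STRING,
  col3 STRING,
  col4 STRING
)
PARTITIONED BY (accessdate STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','  -- assumption; use whatever format the files are written in
LOCATION '/common/data/test/accessCounts';
```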