Vivek,

Take a look at HiveOutputModule.populateDAG() (https://github.com/apache/apex-malhar/blob/master/hive/src/main/java/org/apache/apex/malhar/hive/HiveOutputModule.java).
This is a sub-DAG with fsRolling (FSPojoToHiveOperator) and hiveStore (HiveOperator) using the file path you supplied (/common/data/test/accessCounts). If you look at the code in com.datatorrent.contrib.hive.AbstractFSRollingOutputOperator.setup(OperatorContext) (the superclass of FSPojoToHiveOperator), it does construct a path for rolling temporary files along the lines you have observed. But the final output should be in the output path you specified if you wait long enough for the creation of those files.

On Tue, May 16, 2017 at 12:53 PM, bhidevivek <bhide.vi...@gmail.com> wrote:

> Hi All,
>
> I am trying to use the HiveOutput module to insert the ingested data into a
> Hive external table. The table is already created with the same location as
> the dt.application.<app_name>.operator.hiveOutput.prop.filePath property,
> and the partition column is accessdate. With the configuration below in the
> properties file, the HDFS file structure I am expecting is:
>
> /common/data/test/accessCounts
> |
> ----- accessdate=2017-05-15
> |       ------- <file1>
> |       ------- <file2>
> ----- accessdate=2017-05-16
>         ------- <file1>
>         ------- <file2>
>
> but the actual structure looks like:
>
> /common/data/test/accessCounts/<yarn_application_id_for_apex_ingest_appl>/10
> |
> ----- 2017-05-15
> |       ------- <file1>
> |       ------- <file2>
> ----- 2017-05-16
>         ------- <file1>
>         ------- <file2>
>
> Questions:
> 1. Why are the yarn application id and some other extra directories created
>    when they are nowhere specified in the config?
> 2. If I want to achieve the structure I want, what other configuration will
>    I need to set?
>
> HiveOutputModule configs
> ========================
>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.filePath</name>
>   <value>/common/data/test/accessCounts</value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.databaseUrl</name>
>   <value><jdbc_url></value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.databaseDriver</name>
>   <value>org.apache.hive.jdbc.HiveDriver</value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.tablename</name>
>   <value><hive table name where records need to be inserted></value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.hivePartitionColumns</name>
>   <value>{accessdate}</value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.password</name>
>   <value><hive connection password></value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.userName</name>
>   <value><hive connection user></value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.hiveColumns</name>
>   <value>{col1,col2,col3,col4}</value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.hiveColumnDataTypes</name>
>   <value>{STRING,STRING,STRING,STRING}</value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.hivePartitionColumnDataTypes</name>
>   <value>{STRING}</value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.expressionsForHiveColumns</name>
>   <value>{"getCol1()","getCol2()","getCol3()","getCol4()"}</value>
> </property>
> <property>
>   <name>dt.application.<app_name>.operator.hiveOutput.prop.expressionsForHivePartitionColumns</name>
>   <value>{"getAccessdate()"}</value>
> </property>
>
> --
> View this message in context: http://apache-apex-users-list.78494.x6.nabble.com/HiveOutputModule-creating-extra-directories-than-specified-while-saving-data-into-HDFS-tp1620.html
> Sent from the Apache Apex Users list mailing list archive at Nabble.com.
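As a minimal illustration of why those extra path components appear: the rolling temporary path is effectively the configured filePath plus runtime identifiers (YARN application id and operator id) plus the raw partition value. The sketch below is an assumption-laden illustration of that shape, not the actual Apex implementation; the method and example values are invented for clarity.

```java
// Illustrative sketch only: shows how a rolling temp path of the observed
// shape (<filePath>/<yarnAppId>/<operatorId>/<partitionValue>) could be
// assembled. Names and values here are assumptions, not the real Apex API.
public class RollingPathSketch {

    /** Builds the temporary rolling-file directory for one partition value. */
    static String rollingDir(String filePath, String yarnAppId,
                             int operatorId, String partitionValue) {
        return filePath + "/" + yarnAppId + "/" + operatorId + "/" + partitionValue;
    }

    public static void main(String[] args) {
        // Hypothetical application id; the operator id 10 matches the
        // directory seen in the question.
        String dir = rollingDir("/common/data/test/accessCounts",
                                "application_1494900000000_0001", 10, "2017-05-15");
        System.out.println(dir);
    }
}
```

This also explains why the temp directories are named `2017-05-15` rather than `accessdate=2017-05-15`: the rolling stage uses the bare partition value, and only the final Hive load places data under the `accessdate=...` partition locations.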