Hi Balaji, Sorry for late response.
I used this query to test: select count(*), par, dashboard_id from dev.statistics_dashboard_visitor_hudi_rt where id = 20190 and par = '20190226' group by dashboard_id ; Thanks, Frank [email protected] <[email protected]> 于2019年3月1日周五 上午10:27写道: > Hi Kaka, > > I see. The output of "describe formatted <table>" formatting got messed up > in the email and I missed the partitioning column. > > Curious, How the query looks like. Is this a hive query ? Can you paste > it ? > > Balaji.V > On Wednesday, February 27, 2019, 6:18:38 PM PST, kaka chen < > [email protected]> wrote: > > > Hi Balaji, > > Thanks. But the table I used is partitioned by par. It cannot been > distinguished. > > Thanks, > Kaka > > Balaji Varadarajan <[email protected]> 于2019年2月28日周四 上午1:26写道: > > > Hi Kaka, > > Yes, this is expected as the table is non-partitioned. The 0.4.5 release > > which happened yesterday has the fix you referenced. > > Thanks,Balaji.V > > > > > > On Tuesday, February 26, 2019, 6:37:59 PM PST, kaka chen < > > [email protected]> wrote: > > > > BTW, because it cannot get partition field, after I merged with > > https://github.com/uber/hudi/pull/569/files, the job can run > successfully. > > > > Thanks, > > Frank > > > > kaka chen <[email protected]> 于2019年2月27日周三 上午10:34写道: > > > > > > > > I have tried two environments(Hive 2.1.1 and Hive 1.1.0-cdh5.15.1) both > > > cannot get the partition field. > > > > > > And I added simple logs to show the result: > > > > > > LOG.info("schema: " + schema + " partitioningFields: " + > > partitioningFields); > > > > > > 2019-02-26 19:53:47,855 INFO [main] > > com.uber.hoodie.hadoop.realtime.AbstractRealtimeRecordReader: schema: > > > {"type":"record","name":"test_record","namespace":"hoodie.test","fields":[{"name":"_hoodie_commit_time","type":["null","string"],"default":null},{"name":"_hoodie_commit_seqno","type":["null","string"],"default":null},{"name":"_hoodie_record_key","type":["null","string"],"default":null},{"name":"_hoodie_partition_path","type":["null","string"],"default":null},{"name":"_hoodie_file_name","type":["null","string"],"default":null},{"name":"id","type":["null","int"],"default":null},{"name":"user_id","type":["null","int"],"default":null},{"name":"dashboard_id","type":["null","int"],"default":null},{"name":"created_at","type":["null","string"],"default":null},{"name":"updated_at","type":["null","string"],"default":null},{"name":"timestamp","type":["null","long"],"default":null},{"name":"eventType","type":["null","string"],"default":null},{"name":"par","type":["null","string"],"default":null}]} > > partitioningFields: [] > > > > > > > > > > > > > > > desc formatted dev.statistics_dashboard_visitor_hudi_rt > > > > > > > > > Hive 2.1.1: > > > > > > > > > col_name, data_type, comment # col_name , data_type , comment , , > > > _hoodie_commit_time, string, _hoodie_commit_seqno, string, > > > _hoodie_record_key, string, _hoodie_partition_path, string, > > > _hoodie_file_name, string, id, int, user_id, int, dashboard_id, int, > > > created_at, string, updated_at, string, timestamp, bigint, eventtype, > > > string, , , # Partition Information, , # col_name , data_type , comment > > , , > > > par, string, , , # Detailed Table Information, , Database: , dev , > > Owner: , > > > app , CreateTime: , Mon Feb 25 12:07:34 CST 2019, LastAccessTime: , > > UNKNOWN > > > , Retention: , 0 , Location: , > > > > > > hdfs://yz-cluster-qa/user/hive/warehouse/dev.db/statistics_dashboard_visitor_hudi, > > > Table Type: , EXTERNAL_TABLE , Table Parameters:, , , EXTERNAL , TRUE , > > > spark.sql.sources.schema.numPartCols, 1 , > > > spark.sql.sources.schema.numParts, 1 , spark.sql.sources.schema.part.0, > > > > > > {\"type\":\"struct\",\"fields\":[{\"name\":\"_hoodie_commit_time\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"string\"}},{\"name\":\"_hoodie_commit_seqno\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"string\"}},{\"name\":\"_hoodie_record_key\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"string\"}},{\"name\":\"_hoodie_partition_path\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"string\"}},{\"name\":\"_hoodie_file_name\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"string\"}},{\"name\":\"id\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"int\"}},{\"name\":\"user_id\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"int\"}},{\"name\":\"dashboard_id\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"int\"}},{\"name\":\"created_at\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"string\"}},{\"name\":\"updated_at\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"string\"}},{\"name\":\"timestamp\",\"type\":\"long\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"bigint\"}},{\"name\":\"eventType\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"comment\":\"\",\"HIVE_TYPE_STRING\":\"string\"}},{\"name\":\"par\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"HIVE_TYPE_STRING\":\"string\"}}]} > > > , spark.sql.sources.schema.partCol.0, par , transient_lastDdlTime, > > > 1551074880 , , # Storage Information, , SerDe Library: , > > > com.uber.hoodie.hadoop.realtime.HoodieParquetSerde, InputFormat: , > > > com.uber.hoodie.hadoop.realtime.HoodieRealtimeInputFormat, > OutputFormat: > > , > > > org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat, > > Compressed: > > > , No , Num Buckets: , -1 , Bucket Columns: , [] , Sort Columns: , [] , > > > Storage Desc Params:, , , serialization.format, 1 > > > > > > > > > > > > Hive 1.1.0-cdh5.15.1 > > > > > > > > > > > > -------------------------------+----------------------------------------------------+-----------------------+--+ > > > > > > | col_name | data_type > > > | comment | > > > > > > > > > > > > +-------------------------------+----------------------------------------------------+-----------------------+--+ > > > > > > | # col_name | data_type > > > | comment | > > > > > > | | NULL > > > | NULL | > > > > > > | _hoodie_commit_time | string > > > | | > > > > > > | _hoodie_commit_seqno | string > > > | | > > > > > > | _hoodie_record_key | string > > > | | > > > > > > | _hoodie_partition_path | string > > > | | > > > > > > | _hoodie_file_name | string > > > | | > > > > > > | id | int > > > | | > > > > > > | user_id | int > > > | | > > > > > > | dashboard_id | int > > > | | > > > > > > | created_at | string > > > | | > > > > > > | updated_at | string > > > | | > > > > > > | timestamp | bigint > > > | | > > > > > > | eventtype | string > > > | | > > > > > > | | NULL > > > | NULL | > > > > > > | # Partition Information | NULL > > > | NULL | > > > > > > | # col_name | data_type > > > | comment | > > > > > > | | NULL > > > | NULL | > > > > > > | par | string > > > | | > > > > > > | | NULL > > > | NULL | > > > > > > | # Detailed Table Information | NULL > > > | NULL | > > > > > > | Database: | dev > > > | NULL | > > > > > > | Owner: | hive > > > | NULL | > > > > > > | CreateTime: | Tue Feb 26 00:05:25 CST 2019 > > > | NULL | > > > > > > | LastAccessTime: | UNKNOWN > > > | NULL | > > > > > > | Protect Mode: | None > > > | NULL | > > > > > > | Retention: | 0 > > > | NULL | > > > > > > | Location: | > > > > > > hdfs://qabb-perf-alluxio-hadoop0:8020/user/hive/warehouse/dev.db/statistics_dashboard_visitor_hudi > > > | NULL | > > > > > > | Table Type: | EXTERNAL_TABLE > > > | NULL | > > > > > > | Table Parameters: | NULL > > > | NULL | > > > > > > | | EXTERNAL > > > | TRUE | > > > > > > | | numPartitions > > > | 1 | > > > > > > | | transient_lastDdlTime > > > | 1551110725 | > > > > > > | | NULL > > > | NULL | > > > > > > | # Storage Information | NULL > > > | NULL | > > > > > > | SerDe Library: | > > > com.uber.hoodie.hadoop.realtime.HoodieParquetSerde | NULL > > | > > > > > > | InputFormat: | > > > com.uber.hoodie.hadoop.realtime.HoodieRealtimeInputFormat | NULL > > > | > > > > > > | OutputFormat: | > > > org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat | NULL > > > | > > > > > > | Compressed: | No > > > | NULL | > > > > > > | Num Buckets: | -1 > > > | NULL | > > > > > > | Bucket Columns: | [] > > > | NULL | > > > > > > | Sort Columns: | [] > > > | NULL | > > > > > > | Storage Desc Params: | NULL > > > | NULL | > > > > > > | | serialization.format > > > | 1 | > > > > > > > > > > > > +-------------------------------+----------------------------------------------------+-----------------------+--+ > > > > > > Thanks, > > > Frank > > > > > > [email protected] <[email protected]> 于2019年2月27日周三 上午6:15写道: > > > > > >> Hi Frank, > > >> As Vinoth mentioned, can you share your environment (especially > > >> Hive/Spark version). Also, Can you paste the table definition as seen > in > > >> Hive metastore ( desc formatted <table_name> ) > > >> > > >> Balaji.V > > >> On Tuesday, February 26, 2019, 11:10:16 AM PST, Vinoth Chandar < > > >> [email protected]> wrote: > > >> > > >> Hi, > > >> > > >> Can you share more details about your environment and the full stack > > >> trace? > > >> > > >> Thanks > > >> Vinoth > > >> > > >> On Mon, Feb 25, 2019 at 11:10 PM kaka chen <[email protected]> > > wrote: > > >> > > >> > Hi All, > > >> > > > >> > AbstractRealtimeRecordReader cannot get the partition field from the > > >> > hive partition table by > > >> > String partitionFields = jobConf.get("partition_columns", ""); > > >> > > > >> > Thanks, > > >> > Frank > > >> > > > >> > > > > > > >
