[
https://issues.apache.org/jira/browse/HUDI-8553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900184#comment-17900184
]
Jonathan Vexler commented on HUDI-8553:
---------------------------------------
I have verified this with the script:
{code:java}
SET hoodie.spark.sql.optimized.writes.enable = false;
CREATE TABLE table2 (
  ts BIGINT,
  uuid STRING,
  rider STRING,
  driver STRING,
  fare DOUBLE,
  city STRING
) USING HUDI
LOCATION 'file:///tmp/testpositions'
TBLPROPERTIES (
  type = 'mor',
  primaryKey = 'uuid',
  preCombineField = 'ts'
)
PARTITIONED BY (city);
INSERT INTO table2 VALUES
  (1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
  (1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70,'san_francisco'),
  (1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90,'san_francisco'),
  (1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
  (1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'),
  (1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40,'sao_paulo'),
  (1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06,'chennai'),
  (1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');
SET hoodie.merge.small.file.group.candidates.limit = 0;
UPDATE table2 SET fare = 20.0 WHERE rider = 'rider-A';
DELETE FROM table2 WHERE uuid = 'e3cf430c-889d-4015-bc98-59bdce1e530c';
select * from table2; {code}
I tested with optimized writes both enabled and disabled. When optimized writes
are disabled, there is no warning about position fallback.
Here is the output with optimized writes set to false:
{code:java}
spark-sql (default)> SET hoodie.spark.sql.optimized.writes.enable = false;
24/11/21 16:11:45 WARN DFSPropertiesConfiguration: Properties file
file:/etc/hudi/conf/hudi-defaults.conf not found. Ignoring to load props file
24/11/21 16:11:45 WARN DFSPropertiesConfiguration: Cannot find HUDI_CONF_DIR,
please set it as the dir of hudi-defaults.conf
hoodie.spark.sql.optimized.writes.enable false
Time taken: 0.764 seconds, Fetched 1 row(s)
spark-sql (default)> CREATE TABLE table2 (
> ts BIGINT,
> uuid STRING,
> rider STRING,
> driver STRING,
> fare DOUBLE,
> city STRING
> ) USING HUDI
> LOCATION 'file:///tmp/testpositions'
> TBLPROPERTIES (
> type = 'mor',
> primaryKey = 'uuid',
> preCombineField = 'ts'
> )
> PARTITIONED BY (city);
24/11/21 16:11:52 WARN TableSchemaResolver: Could not find any data file
written for commit, so could not get schema for table file:/tmp/testpositions
Time taken: 0.384 seconds
spark-sql (default)> INSERT INTO table2
> VALUES
>
(1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
>
(1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70
,'san_francisco'),
>
(1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90
,'san_francisco'),
>
(1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
>
(1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'
),
>
(1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40
,'sao_paulo' ),
>
(1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06
,'chennai' ),
>
(1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');
24/11/21 16:12:02 WARN TableSchemaResolver: Could not find any data file
written for commit, so could not get schema for table file:/tmp/testpositions
24/11/21 16:12:03 WARN TableSchemaResolver: Could not find any data file
written for commit, so could not get schema for table file:/tmp/testpositions
24/11/21 16:12:05 WARN MetricsConfig: Cannot locate configuration: tried
hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
24/11/21 16:12:05 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
# WARNING: Unable to attach Serviceability Agent. Unable to attach even with
module exceptions: [org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException:
Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense
failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense
failed.]
24/11/21 16:12:08 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
Time taken: 5.728 seconds
spark-sql (default)> SET hoodie.merge.small.file.group.candidates.limit = 0;
hoodie.merge.small.file.group.candidates.limit 0
Time taken: 0.012 seconds, Fetched 1 row(s)
spark-sql (default)> UPDATE table2 SET fare = 20.0 WHERE rider = 'rider-A';
24/11/21 16:12:16 WARN SparkStringUtils: Truncated the string representation of
a plan since it was too large. This behavior can be adjusted by setting
'spark.sql.debug.maxToStringFields'.
24/11/21 16:12:16 WARN HoodieFileIndex: Data skipping requires both Metadata
Table and at least one of Column Stats Index, Record Level Index, or Functional
Index to be enabled as well! (isMetadataTableEnabled = false,
isColumnStatsIndexEnabled = false, isRecordIndexApplicable = false,
isFunctionalIndexEnabled = false, isBucketIndexEnable = false,
isPartitionStatsIndexEnabled = false), isBloomFiltersIndexEnabled = false)
24/11/21 16:12:16 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
24/11/21 16:12:17 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
Time taken: 1.802 seconds
spark-sql (default)> DELETE FROM table2 WHERE uuid =
'e3cf430c-889d-4015-bc98-59bdce1e530c';
24/11/21 16:12:27 WARN HoodieFileIndex: Data skipping requires both Metadata
Table and at least one of Column Stats Index, Record Level Index, or Functional
Index to be enabled as well! (isMetadataTableEnabled = false,
isColumnStatsIndexEnabled = false, isRecordIndexApplicable = false,
isFunctionalIndexEnabled = false, isBucketIndexEnable = false,
isPartitionStatsIndexEnabled = false), isBloomFiltersIndexEnabled = false)
24/11/21 16:12:27 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
24/11/21 16:12:27 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
Time taken: 1.332 seconds
spark-sql (default)> select * from table2;
20241121161203621 20241121161203621_0_0
1dced545-862b-4ceb-8b43-d2a568f6616b city=san_francisco
1ad629cc-6f75-4ac3-bff2-e4f842421f51-0_0-21-67_20241121161203621.parquet
1695332066204 1dced545-862b-4ceb-8b43-d2a568f6616b rider-E driver-O
93.5 san_francisco
20241121161203621 20241121161203621_0_1
e96c4396-3fad-413a-a942-4cb36106d721 city=san_francisco
1ad629cc-6f75-4ac3-bff2-e4f842421f51-0_0-21-67_20241121161203621.parquet
1695091554788 e96c4396-3fad-413a-a942-4cb36106d721 rider-C driver-M
27.7 san_francisco
20241121161203621 20241121161203621_0_2
9909a8b1-2d15-4d3d-8ec9-efc48c536a00 city=san_francisco
1ad629cc-6f75-4ac3-bff2-e4f842421f51-0_0-21-67_20241121161203621.parquet
1695046462179 9909a8b1-2d15-4d3d-8ec9-efc48c536a00 rider-D driver-L
33.9 san_francisco
20241121161216516 20241121161216516_0_1
334e26e9-8355-45cc-97c6-c31daf0df330 city=san_francisco
1ad629cc-6f75-4ac3-bff2-e4f842421f51-0 1695159649087
334e26e9-8355-45cc-97c6-c31daf0df330 rider-A driver-K 20.0
san_francisco
20241121161203621 20241121161203621_1_1
7a84095f-737f-40bc-b62f-6b69664712d2 city=sao_paulo
c06df00f-d40d-42b1-b320-52de6bd05d0e-0_1-21-68_20241121161203621.parquet
1695376420876 7a84095f-737f-40bc-b62f-6b69664712d2 rider-G driver-Q
43.4 sao_paulo
20241121161203621 20241121161203621_2_0
3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04 city=chennai
41db64e9-04c0-4fcb-8378-ce50e0dc7c22-0_2-21-69_20241121161203621.parquet
1695173887231 3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04 rider-I driver-S
41.06 chennai
20241121161203621 20241121161203621_2_1
c8abbe79-8d89-47ea-b4ce-4d224bae5bfa city=chennai
41db64e9-04c0-4fcb-8378-ce50e0dc7c22-0_2-21-69_20241121161203621.parquet
1695115999911 c8abbe79-8d89-47ea-b4ce-4d224bae5bfa rider-J driver-T
17.85 chennai
Time taken: 0.219 seconds, Fetched 7 row(s) {code}
And here is the output without setting optimized writes to false, so the
default of true applies:
{code:java}
spark-sql (default)> CREATE TABLE table2 (
> ts BIGINT,
> uuid STRING,
> rider STRING,
> driver STRING,
> fare DOUBLE,
> city STRING
> ) USING HUDI
> LOCATION 'file:///tmp/testpositions'
> TBLPROPERTIES (
> type = 'mor',
> primaryKey = 'uuid',
> preCombineField = 'ts'
> )
> PARTITIONED BY (city);
24/11/21 16:14:20 WARN DFSPropertiesConfiguration: Properties file
file:/etc/hudi/conf/hudi-defaults.conf not found. Ignoring to load props file
24/11/21 16:14:20 WARN DFSPropertiesConfiguration: Cannot find HUDI_CONF_DIR,
please set it as the dir of hudi-defaults.conf
24/11/21 16:14:20 WARN TableSchemaResolver: Could not find any data file
written for commit, so could not get schema for table file:/tmp/testpositions
Time taken: 1.004 seconds
spark-sql (default)> INSERT INTO table2
> VALUES
>
(1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
>
(1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70
,'san_francisco'),
>
(1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90
,'san_francisco'),
>
(1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
>
(1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'
),
>
(1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40
,'sao_paulo' ),
>
(1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06
,'chennai' ),
>
(1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');
24/11/21 16:14:28 WARN TableSchemaResolver: Could not find any data file
written for commit, so could not get schema for table file:/tmp/testpositions
24/11/21 16:14:28 WARN TableSchemaResolver: Could not find any data file
written for commit, so could not get schema for table file:/tmp/testpositions
24/11/21 16:14:30 WARN MetricsConfig: Cannot locate configuration: tried
hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
24/11/21 16:14:31 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
# WARNING: Unable to attach Serviceability Agent. Unable to attach even with
module exceptions: [org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException:
Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense
failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense
failed.]
24/11/21 16:14:33 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
Time taken: 5.734 seconds
spark-sql (default)> SET hoodie.merge.small.file.group.candidates.limit = 0;
hoodie.merge.small.file.group.candidates.limit 0
Time taken: 0.016 seconds, Fetched 1 row(s)
spark-sql (default)> UPDATE table2 SET fare = 20.0 WHERE rider = 'rider-A';
24/11/21 16:14:41 WARN SparkStringUtils: Truncated the string representation of
a plan since it was too large. This behavior can be adjusted by setting
'spark.sql.debug.maxToStringFields'.
24/11/21 16:14:41 WARN HoodieFileIndex: Data skipping requires both Metadata
Table and at least one of Column Stats Index, Record Level Index, or Functional
Index to be enabled as well! (isMetadataTableEnabled = false,
isColumnStatsIndexEnabled = false, isRecordIndexApplicable = false,
isFunctionalIndexEnabled = false, isBucketIndexEnable = false,
isPartitionStatsIndexEnabled = false), isBloomFiltersIndexEnabled = false)
24/11/21 16:14:41 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
24/11/21 16:14:42 WARN HoodieDataBlock: There are records without valid
positions. Skip writing record positions to the data block header.
24/11/21 16:14:42 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
Time taken: 1.59 seconds
spark-sql (default)> DELETE FROM table2 WHERE uuid =
'e3cf430c-889d-4015-bc98-59bdce1e530c';
24/11/21 16:14:47 WARN HoodieFileIndex: Data skipping requires both Metadata
Table and at least one of Column Stats Index, Record Level Index, or Functional
Index to be enabled as well! (isMetadataTableEnabled = false,
isColumnStatsIndexEnabled = false, isRecordIndexApplicable = false,
isFunctionalIndexEnabled = false, isBucketIndexEnable = false,
isPartitionStatsIndexEnabled = false), isBloomFiltersIndexEnabled = false)
24/11/21 16:14:47 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
24/11/21 16:14:47 WARN HoodiePositionBasedFileGroupRecordBuffer: No record
position info is found when attempt to do position based merge.
24/11/21 16:14:47 WARN HoodiePositionBasedFileGroupRecordBuffer: Falling back
to key based merge for Read
24/11/21 16:14:47 WARN HoodieDeleteBlock: There are delete records without
valid positions. Skip writing record positions to the delete block header.
24/11/21 16:14:47 WARN HoodieBackedTableMetadataWriter: Skipping secondary
index initialization as only one secondary index bootstrap at a time is
supported for now. Provided: []
Time taken: 1.103 seconds
spark-sql (default)> select * from table2;
24/11/21 16:14:53 WARN HoodiePositionBasedFileGroupRecordBuffer: No record
position info is found when attempt to do position based merge.
24/11/21 16:14:53 WARN HoodiePositionBasedFileGroupRecordBuffer: No record
position info is found when attempt to do position based merge.
24/11/21 16:14:53 WARN HoodiePositionBasedFileGroupRecordBuffer: Falling back
to key based merge for Read
24/11/21 16:14:53 WARN HoodiePositionBasedFileGroupRecordBuffer: Falling back
to key based merge for Read
20241121161428912 20241121161428912_0_0
1dced545-862b-4ceb-8b43-d2a568f6616b city=san_francisco
cf8f187a-f827-454d-a26f-114e30c519ed-0_0-21-67_20241121161428912.parquet
1695332066204 1dced545-862b-4ceb-8b43-d2a568f6616b rider-E driver-O
93.5 san_francisco
20241121161428912 20241121161428912_0_1
e96c4396-3fad-413a-a942-4cb36106d721 city=san_francisco
cf8f187a-f827-454d-a26f-114e30c519ed-0_0-21-67_20241121161428912.parquet
1695091554788 e96c4396-3fad-413a-a942-4cb36106d721 rider-C driver-M
27.7 san_francisco
20241121161428912 20241121161428912_0_2
9909a8b1-2d15-4d3d-8ec9-efc48c536a00 city=san_francisco
cf8f187a-f827-454d-a26f-114e30c519ed-0_0-21-67_20241121161428912.parquet
1695046462179 9909a8b1-2d15-4d3d-8ec9-efc48c536a00 rider-D driver-L
33.9 san_francisco
20241121161441739 20241121161441739_0_1
334e26e9-8355-45cc-97c6-c31daf0df330 city=san_francisco
cf8f187a-f827-454d-a26f-114e30c519ed-0 1695159649087
334e26e9-8355-45cc-97c6-c31daf0df330 rider-A driver-K 20.0
san_francisco
20241121161428912 20241121161428912_1_1
7a84095f-737f-40bc-b62f-6b69664712d2 city=sao_paulo
22b6070f-6c72-4a3d-9fc6-8bac16a7e873-0_1-21-68_20241121161428912.parquet
1695376420876 7a84095f-737f-40bc-b62f-6b69664712d2 rider-G driver-Q
43.4 sao_paulo
20241121161428912 20241121161428912_2_0
3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04 city=chennai
878ae75b-bb04-4ed8-8591-8fafc56ed7ba-0_2-21-69_20241121161428912.parquet
1695173887231 3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04 rider-I driver-S
41.06 chennai
20241121161428912 20241121161428912_2_1
c8abbe79-8d89-47ea-b4ce-4d224bae5bfa city=chennai
878ae75b-bb04-4ed8-8591-8fafc56ed7ba-0_2-21-69_20241121161428912.parquet
1695115999911 c8abbe79-8d89-47ea-b4ce-4d224bae5bfa rider-J driver-T
17.85 chennai
Time taken: 0.185 seconds, Fetched 7 row(s) {code}
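To compare the two runs without reading every WARN line, the transcripts can be scanned for the three loggers that mention record positions. This is only a rough sketch: the logger names are copied from the output above, while `count_position_warnings`, `POSITION_LOGGERS` and `WARN_RE` are hypothetical helper names and not part of Hudi or Spark.

```python
import re
from collections import Counter

# Logger names copied from the WARN lines in the transcripts above.
POSITION_LOGGERS = (
    "HoodieDataBlock",
    "HoodieDeleteBlock",
    "HoodiePositionBasedFileGroupRecordBuffer",
)

# Matches "<yy/MM/dd> <HH:mm:ss> WARN <Logger>:" anywhere in the text.
# Whitespace is matched loosely so that entries still count when the
# transcript was re-wrapped or its line breaks were lost.
WARN_RE = re.compile(r"\d{2}/\d{2}/\d{2}\s+\d{2}:\d{2}:\d{2}\s+WARN\s+(\w+):")


def count_position_warnings(log_text: str) -> Counter:
    """Count WARN entries per position-related logger in a spark-sql transcript."""
    counts = Counter()
    for logger in WARN_RE.findall(log_text):
        if logger in POSITION_LOGGERS:
            counts[logger] += 1
    return counts
```

Going by the two transcripts above, calling `count_position_warnings` on the run with optimized writes disabled should return an empty counter, and the default run should show non-zero counts for all three loggers.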
> Spark SQL UPDATE and DELETE should write record positions
> ---------------------------------------------------------
>
> Key: HUDI-8553
> URL: https://issues.apache.org/jira/browse/HUDI-8553
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Y Ethan Guo
> Assignee: Jonathan Vexler
> Priority: Blocker
> Fix For: 1.0.0
>
>
> Though there is no read and write error, Spark SQL UPDATE and DELETE do not
> write record positions to the log files.
> {code:java}
> spark-sql (default)> CREATE TABLE testing_positions.table2 (
> > ts BIGINT,
> > uuid STRING,
> > rider STRING,
> > driver STRING,
> > fare DOUBLE,
> > city STRING
> > ) USING HUDI
> > LOCATION
> 'file:///Users/ethan/Work/tmp/hudi-1.0.0-testing/positional/table2'
> > TBLPROPERTIES (
> > type = 'mor',
> > primaryKey = 'uuid',
> > preCombineField = 'ts'
> > )
> > PARTITIONED BY (city);
> 24/11/16 12:03:26 WARN TableSchemaResolver: Could not find any data file
> written for commit, so could not get schema for table
> file:/Users/ethan/Work/tmp/hudi-1.0.0-testing/positional/table2
> Time taken: 0.4 seconds
> spark-sql (default)> INSERT INTO testing_positions.table2
> > VALUES
> >
> (1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
> >
> (1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70
> ,'san_francisco'),
> >
> (1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90
> ,'san_francisco'),
> >
> (1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
> >
> (1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'
> ),
> >
> (1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40
> ,'sao_paulo' ),
> >
> (1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06
> ,'chennai' ),
> >
> (1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');
> 24/11/16 12:03:26 WARN TableSchemaResolver: Could not find any data file
> written for commit, so could not get schema for table
> file:/Users/ethan/Work/tmp/hudi-1.0.0-testing/positional/table2
> 24/11/16 12:03:26 WARN TableSchemaResolver: Could not find any data file
> written for commit, so could not get schema for table
> file:/Users/ethan/Work/tmp/hudi-1.0.0-testing/positional/table2
> 24/11/16 12:03:29 WARN log: Updating partition stats fast for: table2_ro
> 24/11/16 12:03:29 WARN log: Updated size to 436166
> 24/11/16 12:03:29 WARN log: Updating partition stats fast for: table2_ro
> 24/11/16 12:03:29 WARN log: Updating partition stats fast for: table2_ro
> 24/11/16 12:03:29 WARN log: Updated size to 436185
> 24/11/16 12:03:29 WARN log: Updated size to 436386
> 24/11/16 12:03:30 WARN log: Updating partition stats fast for: table2_rt
> 24/11/16 12:03:30 WARN log: Updating partition stats fast for: table2_rt
> 24/11/16 12:03:30 WARN log: Updated size to 436166
> 24/11/16 12:03:30 WARN log: Updated size to 436386
> 24/11/16 12:03:30 WARN log: Updating partition stats fast for: table2_rt
> 24/11/16 12:03:30 WARN log: Updated size to 436185
> 24/11/16 12:03:30 WARN log: Updating partition stats fast for: table2
> 24/11/16 12:03:30 WARN log: Updated size to 436166
> 24/11/16 12:03:30 WARN log: Updating partition stats fast for: table2
> 24/11/16 12:03:30 WARN log: Updated size to 436386
> 24/11/16 12:03:30 WARN log: Updating partition stats fast for: table2
> 24/11/16 12:03:30 WARN log: Updated size to 436185
> 24/11/16 12:03:30 WARN HiveConf: HiveConf of name
> hive.internal.ss.authz.settings.applied.marker does not exist
> 24/11/16 12:03:30 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout
> does not exist
> 24/11/16 12:03:30 WARN HiveConf: HiveConf of name hive.stats.retries.wait
> does not exist
> Time taken: 4.843 seconds
> spark-sql (default)>
> > SET hoodie.merge.small.file.group.candidates.limit = 0;
> hoodie.merge.small.file.group.candidates.limit 0
> Time taken: 0.018 seconds, Fetched 1 row(s)
> spark-sql (default)>
> > UPDATE testing_positions.table2 SET fare = 20.0 WHERE
> rider = 'rider-A';
> 24/11/16 12:03:31 WARN SparkStringUtils: Truncated the string representation
> of a plan since it was too large. This behavior can be adjusted by setting
> 'spark.sql.debug.maxToStringFields'.
> 24/11/16 12:03:32 WARN HoodieFileIndex: Data skipping requires both Metadata
> Table and at least one of Column Stats Index, Record Level Index, or
> Functional Index to be enabled as well! (isMetadataTableEnabled = false,
> isColumnStatsIndexEnabled = false, isRecordIndexApplicable = false,
> isFunctionalIndexEnabled = false, isBucketIndexEnable = false,
> isPartitionStatsIndexEnabled = false), isBloomFiltersIndexEnabled = false)
> 24/11/16 12:03:32 WARN HoodieDataBlock: There are records without valid
> positions. Skip writing record positions to the data block header.
> 24/11/16 12:03:34 WARN HiveConf: HiveConf of name
> hive.internal.ss.authz.settings.applied.marker does not exist
> 24/11/16 12:03:34 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout
> does not exist
> 24/11/16 12:03:34 WARN HiveConf: HiveConf of name hive.stats.retries.wait
> does not exist
> Time taken: 5.545 seconds
> spark-sql (default)>
> > DELETE FROM testing_positions.table2 WHERE uuid =
> 'e3cf430c-889d-4015-bc98-59bdce1e530c';
> 24/11/16 12:03:37 WARN HoodieFileIndex: Data skipping requires both Metadata
> Table and at least one of Column Stats Index, Record Level Index, or
> Functional Index to be enabled as well! (isMetadataTableEnabled = false,
> isColumnStatsIndexEnabled = false, isRecordIndexApplicable = false,
> isFunctionalIndexEnabled = false, isBucketIndexEnable = false,
> isPartitionStatsIndexEnabled = false), isBloomFiltersIndexEnabled = false)
> 24/11/16 12:03:37 WARN HoodiePositionBasedFileGroupRecordBuffer: No record
> position info is found when attempt to do position based merge.
> 24/11/16 12:03:37 WARN HoodiePositionBasedFileGroupRecordBuffer: Falling back
> to key based merge for Read
> 24/11/16 12:03:38 WARN HoodieDeleteBlock: There are delete records without
> valid positions. Skip writing record positions to the delete block header.
> 24/11/16 12:03:39 WARN HiveConf: HiveConf of name
> hive.internal.ss.authz.settings.applied.marker does not exist
> 24/11/16 12:03:39 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout
> does not exist
> 24/11/16 12:03:39 WARN HiveConf: HiveConf of name hive.stats.retries.wait
> does not exist
> Time taken: 2.992 seconds
> spark-sql (default)>
> > select * from testing_positions.table2;
> 24/11/16 12:03:41 WARN HoodiePositionBasedFileGroupRecordBuffer: No record
> position info is found when attempt to do position based merge.
> 24/11/16 12:03:41 WARN HoodiePositionBasedFileGroupRecordBuffer: No record
> position info is found when attempt to do position based merge.
> 24/11/16 12:03:41 WARN HoodiePositionBasedFileGroupRecordBuffer: Falling back
> to key based merge for Read
> 24/11/16 12:03:41 WARN HoodiePositionBasedFileGroupRecordBuffer: Falling back
> to key based merge for Read
> 20241116120326527 20241116120326527_0_0
> 1dced545-862b-4ceb-8b43-d2a568f6616b city=san_francisco
> 1ba64ef0-bba2-469e-8ef5-696f8cdbe141-0_0-186-338_20241116120326527.parquet
> 1695332066204 1dced545-862b-4ceb-8b43-d2a568f6616b rider-E driver-O
> 93.5 san_francisco
> 20241116120326527 20241116120326527_0_1
> e96c4396-3fad-413a-a942-4cb36106d721 city=san_francisco
> 1ba64ef0-bba2-469e-8ef5-696f8cdbe141-0_0-186-338_20241116120326527.parquet
> 1695091554788 e96c4396-3fad-413a-a942-4cb36106d721 rider-C driver-M
> 27.7 san_francisco
> 20241116120326527 20241116120326527_0_2
> 9909a8b1-2d15-4d3d-8ec9-efc48c536a00 city=san_francisco
> 1ba64ef0-bba2-469e-8ef5-696f8cdbe141-0_0-186-338_20241116120326527.parquet
> 1695046462179 9909a8b1-2d15-4d3d-8ec9-efc48c536a00 rider-D driver-L
> 33.9 san_francisco
> 20241116120331896 20241116120331896_0_9
> 334e26e9-8355-45cc-97c6-c31daf0df330 city=san_francisco
> 1ba64ef0-bba2-469e-8ef5-696f8cdbe141-0 1695159649087
> 334e26e9-8355-45cc-97c6-c31daf0df330 rider-A driver-K 20.0
> san_francisco
> 20241116120326527 20241116120326527_1_1
> 7a84095f-737f-40bc-b62f-6b69664712d2 city=sao_paulo
> ba555452-0c3c-47dc-acc0-f90823e12408-0_1-186-339_20241116120326527.parquet
> 1695376420876 7a84095f-737f-40bc-b62f-6b69664712d2 rider-G driver-Q
> 43.4 sao_paulo
> 20241116120326527 20241116120326527_2_0
> 3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04 city=chennai
> 8dacb2f9-6901-4ab3-8139-697b51125f16-0_2-186-340_20241116120326527.parquet
> 1695173887231 3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04 rider-I driver-S
> 41.06 chennai
> 20241116120326527 20241116120326527_2_1
> c8abbe79-8d89-47ea-b4ce-4d224bae5bfa city=chennai
> 8dacb2f9-6901-4ab3-8139-697b51125f16-0_2-186-340_20241116120326527.parquet
> 1695115999911 c8abbe79-8d89-47ea-b4ce-4d224bae5bfa rider-J driver-T
> 17.85 chennai
> Time taken: 1.719 seconds, Fetched 7 row(s) {code}