umehrot2 commented on issue #2277:
URL: https://github.com/apache/hudi/issues/2277#issuecomment-737136631


   It looks like the basic upsert itself is not working. I doubt it has 
anything to do with the fact that you are using Glue. Each record is uniquely 
identified by **record key + partition key**. To debug, print out the 
particular row you get when you do:
   ```
   from pyspark.sql.functions import col, lit
   updateDF = hudiDF.limit(1).withColumn('sequence', lit('new_value')).withColumn('upd_ind', lit(1))
   ```
   
   OR
   
   ```
   updateDF = hudiDF.filter(col('ID_key') == 64777).withColumn('sequence', lit('new_value')).withColumn('upd_ind', lit(1))
   ```
   
   Verify that this row does contain the record key, partition key and 
pre-combine columns. If it does, that particular record should get updated. 
Nothing in your description stands out to me as obviously wrong.
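
The check described above can be sketched as a small helper. This is only an illustration, not part of Hudi: `diagnose_update_row` is a hypothetical name, and the rows are plain dicts (for example from `Row.asDict()`). The column names used in the usage note are assumptions too, since the issue does not say which columns are configured as keys.

```python
def diagnose_update_row(existing, update, record_key, partition_key, precombine):
    """Return a list of reasons why `update` may not upsert onto `existing`.

    `existing` and `update` are plain dicts mapping column name -> value.
    The three field names are whatever the table was configured with.
    """
    problems = []

    # The update row needs all three columns, and none of them may be null.
    for role, field in (
        ("record key", record_key),
        ("partition key", partition_key),
        ("pre-combine", precombine),
    ):
        if update.get(field) is None:
            problems.append(f"{role} column '{field}' is missing or null")

    # A record is identified by (record key, partition key), so both values
    # must be unchanged for the update to land on the existing record.
    for role, field in (
        ("record key", record_key),
        ("partition key", partition_key),
    ):
        if field in existing and field in update and existing[field] != update[field]:
            problems.append(
                f"{role} column '{field}' changed from "
                f"{existing[field]!r} to {update[field]!r}"
            )

    return problems
```

With PySpark this could be called as `diagnose_update_row(hudiDF.first().asDict(), updateDF.first().asDict(), 'ID_key', 'region', 'ts')`, where `region` and `ts` are placeholder names for the partition and pre-combine columns, and both data frames are assumed to return the same record first. An empty list means the row carries all three columns and still points at the same record.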
   
   Which version of Hudi are you using?
   
   A couple of things I did notice:
   - You are using `org.apache.hudi.keygen.ComplexKeyGenerator`, which, judging 
by your example, you don't seem to need, because you have only one column for 
the record key and one column for the partition key. You may want to skip it 
and let Hudi use `SimpleKeyGenerator`, which is the default.
   - Another thing worth trying: once you have `updateDF`, drop the Hudi 
metadata columns from it, i.e. the columns whose names begin with `_hoodie`. I 
have sometimes noticed weird issues when the update data frame has metadata 
columns in it. Most likely this is not the issue in your case, but I would 
like to rule it out before we proceed.
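
Both suggestions can be sketched in a few lines. The helper names `hoodie_meta_columns` and `simple_key_write_options` are hypothetical, and the table and column names passed in are placeholders. The option keys are the standard Hudi datasource write options as I understand them, so check them against the documentation for your Hudi version.

```python
HOODIE_META_PREFIX = "_hoodie"


def hoodie_meta_columns(columns):
    """Return the column names that look like Hudi metadata columns."""
    return [name for name in columns if name.startswith(HOODIE_META_PREFIX)]


def simple_key_write_options(table_name, record_key, partition_key, precombine):
    """Build upsert options for one record key and one partition column.

    No `hoodie.datasource.write.keygenerator.class` entry is set, so the
    writer falls back to its default key generator.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_key,
        "hoodie.datasource.write.precombine.field": precombine,
    }
```

On the data frame itself, the metadata columns can then be removed with `updateDF.drop(*hoodie_meta_columns(updateDF.columns))`, and the dict can be passed to the writer with `.options(**opts)`.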
   



