Cpandey43 opened a new issue, #10770:
URL: https://github.com/apache/hudi/issues/10770

   **Describe the problem you faced**
   
   I receive messages from Kafka as JSON objects in which one value contains an 
Array[Byte].
   When I write the same data to a Hudi table, the Array[Byte] values are 
stored as NULL.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   The code and sample data needed to reproduce it are provided below.
   
   **Environment Description**
   
   * Hudi version : 0.13.1
   
   * Spark version : 3.2.4
   
   * Running on Docker? (yes/no) : k8s 
   
   **Code (scala)**
   
   `import com.fasterxml.jackson.databind.ObjectMapper
   
   import java.io.{ByteArrayInputStream, ByteArrayOutputStream, FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}
   
   class Employee extends java.io.Serializable{
     var id: String = ""
     var name: String = ""
     var department: String= ""
   
     def this(id:String, name: String,department:String){
       this()
       this.id = id
       this.name = name
       this.department = department
     }
   
     def getId: String = id
     def getName:String = name
     def getDepartment:String = department
   
     def setId(id:String): Unit ={
       this.id = id
     }
   
     def setName(name:String): Unit = {
       this.name = name
     }
   
     def setDepartment(department:String): Unit = {
       this.department = department
     }
   }
   
   val jsonStr = Array(
     "{\"id\": \"123\",\"name\":\"xyz\", \"department\":\"Jan\"}",
     "{\"id\": \"121\",\"name\":\"abc\", \"department\":\"Jan\"}",
     "{\"id\": \"154\",\"name\":\"opq\", \"department\":\"Jan\"}",
     "{\"id\": \"187\",\"name\":\"mno\", \"department\":\"Feb\"}",
     "{\"id\": \"753\",\"name\":\"hud\", \"department\":\"Feb\"}",
     "{\"id\": \"564\",\"name\":\"iow\", \"department\":\"Feb\"}",
     "{\"id\": \"874\",\"name\":\"nhq\", \"department\":\"Mar\"}",
     "{\"id\": \"876\",\"name\":\"zop\", \"department\":\"Mar\"}",
     "{\"id\": \"532\",\"name\":\"fhe\", \"department\":\"Mar\"}",
     "{\"id\": \"334\",\"name\":\"oih\", \"department\":\"Apr\"}"
   )
   val mapper: ObjectMapper = new ObjectMapper()
   val fileOutStream = new FileOutputStream("C:\\local-path\\output.ser")
   val objectOutputStream = new ObjectOutputStream(fileOutStream)
   for (i <- jsonStr) {
     val myObject = mapper.readValue(i, classOf[Employee])
     objectOutputStream.writeObject(myObject)
   }
   objectOutputStream.close()
   fileOutStream.close()
   val fileInputStream = new FileInputStream("C:\\local-path\\output.ser")
   val objectInputStream = new ObjectInputStream(fileInputStream)
   val employee: Employee = objectInputStream.readObject().asInstanceOf[Employee]
   
   println(employee.getId+" "+ employee.getName+" "+employee.getDepartment)`
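
The snippet above imports `ByteArrayInputStream` and `ByteArrayOutputStream` but writes to a file instead. As a minimal sketch, assuming the goal is an in-memory `Array[Byte]` value like the one carried in the Kafka message, the same Java serialization can be done without a file. `EmployeeRecord` and `ByteArrayRoundTrip` are hypothetical names used only for this illustration, and the sketch assumes it is compiled as a standalone program rather than pasted into a REPL. It does not touch Hudi and does not reproduce the NULL itself.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for the Employee class above; case classes are Serializable.
case class EmployeeRecord(id: String, name: String, department: String)

object ByteArrayRoundTrip {
  // Serialize an object to an in-memory byte array using Java serialization.
  def toBytes(obj: java.io.Serializable): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()
    buffer.toByteArray
  }

  // Deserialize a byte array produced by toBytes back into an object.
  def fromBytes[T](bytes: Array[Byte]): T = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val bytes = toBytes(EmployeeRecord("123", "xyz", "Jan"))
    val restored = fromBytes[EmployeeRecord](bytes)
    println(s"${bytes.length} bytes -> ${restored.id} ${restored.name} ${restored.department}")
  }
}
```

Checking that the round trip restores the original fields confirms that the byte array is intact before it is handed to any writer, which helps narrow down where the value is lost.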
   
   **Output data (Array[Byte])**
   
   Warning: the data below was pasted as text, so it may be corrupted. I could 
not attach the output.ser file directly, so I am attaching it as a .txt file.
   [output.txt](https://github.com/apache/hudi/files/14424640/output.txt)
   
   `����sr��$line5.$read$$iw$$iw$Employeel�]�wU�m��L��
   
departmentt��Ljava/lang/String;L��idq��~��L��nameq��~��xpt��Jant��123t��xyzsq��~����t��Jant��121t��abcsq��~����t��Jant��154t��opqsq��~����t��Febt��187t��mnosq��~����t��Febt��753t��hudsq��~����t��Febt��564t��iowsq��~����t��Mart��874t��nhqsq��~����t��Mart��876t��zopsq��~����t��Mart��532t��fhesq��~����t��Aprt��334t��oih`
   

