Cpandey43 opened a new issue, #10770:
URL: https://github.com/apache/hudi/issues/10770
**Describe the problem you faced**
I receive messages from Kafka as JSON objects, in which one field
contains an Array[Byte].
When I write the same data to a Hudi table, the Array[Byte] values are
stored as NULL.
**To Reproduce**
Steps to reproduce the behavior:
The code and sample data needed to reproduce it are attached below.
**Environment Description**
* Hudi version : 0.13.1
* Spark version : 3.2.4
* Running on Docker? (yes/no) : k8s
**Code (scala)**
`import com.fasterxml.jackson.databind.ObjectMapper
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}

// Bean-style class so that Jackson can populate it through the setters.
class Employee extends java.io.Serializable {
  var id: String = ""
  var name: String = ""
  var department: String = ""

  def this(id: String, name: String, department: String) = {
    this()
    this.id = id
    this.name = name
    this.department = department
  }

  def getId: String = id
  def getName: String = name
  def getDepartment: String = department

  def setId(id: String): Unit = {
    this.id = id
  }
  def setName(name: String): Unit = {
    this.name = name
  }
  def setDepartment(department: String): Unit = {
    this.department = department
  }
}

// Sample JSON records.
val jsonStr = Array(
  "{\"id\": \"123\",\"name\":\"xyz\", \"department\":\"Jan\"}",
  "{\"id\": \"121\",\"name\":\"abc\", \"department\":\"Jan\"}",
  "{\"id\": \"154\",\"name\":\"opq\", \"department\":\"Jan\"}",
  "{\"id\": \"187\",\"name\":\"mno\", \"department\":\"Feb\"}",
  "{\"id\": \"753\",\"name\":\"hud\", \"department\":\"Feb\"}",
  "{\"id\": \"564\",\"name\":\"iow\", \"department\":\"Feb\"}",
  "{\"id\": \"874\",\"name\":\"nhq\", \"department\":\"Mar\"}",
  "{\"id\": \"876\",\"name\":\"zop\", \"department\":\"Mar\"}",
  "{\"id\": \"532\",\"name\":\"fhe\", \"department\":\"Mar\"}",
  "{\"id\": \"334\",\"name\":\"oih\", \"department\":\"Apr\"}"
)

// Parse each JSON record into an Employee and write it with Java serialization.
val mapper: ObjectMapper = new ObjectMapper()
val fileOutStream = new FileOutputStream("C:\\local-path\\output.ser")
val objectOutputStream = new ObjectOutputStream(fileOutStream)
for (i <- jsonStr) {
  val myObject = mapper.readValue(i, classOf[Employee])
  objectOutputStream.writeObject(myObject)
}
objectOutputStream.close()
fileOutStream.close()

// Read back only the first serialized Employee as a sanity check.
val fileInputStream = new FileInputStream("C:\\local-path\\output.ser")
val objectInputStream = new ObjectInputStream(fileInputStream)
val employee: Employee = objectInputStream.readObject().asInstanceOf[Employee]
println(employee.getId + " " + employee.getName + " " + employee.getDepartment)`
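
The same serialization can be done in memory to get the Array[Byte] directly, without going through a file. The following is only a minimal sketch: `EmployeeRecord` and `ByteRoundTrip` are hypothetical names used for illustration and are not part of the code above, and it covers a single object rather than the full Kafka-to-Hudi path.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for the Employee class; case classes are Serializable.
case class EmployeeRecord(id: String, name: String, department: String)

object ByteRoundTrip {
  // Serialize one object into an in-memory Array[Byte] with Java serialization.
  def toBytes(value: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    buffer.toByteArray
  }

  // Read the object back from the byte array.
  def fromBytes[T](bytes: Array[Byte]): T = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

For example, `ByteRoundTrip.toBytes(EmployeeRecord("123", "xyz", "Jan"))` returns a byte array that begins with the Java serialization stream header bytes 0xAC 0xED, and `ByteRoundTrip.fromBytes[EmployeeRecord]` turns it back into an equal record.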
**Output data (Array[Byte])**
Warning: the data below was pasted as text, so it may be corrupted. I could
not attach the output.ser file directly, so I am attaching it as a .txt file.
[output.txt](https://github.com/apache/hudi/files/14424640/output.txt)
`����sr��$line5.$read$$iw$$iw$Employeel�]�wU�m��L��
departmentt��Ljava/lang/String;L��idq��~��L��nameq��~��xpt��Jant��123t��xyzsq��~����t��Jant��121t��abcsq��~����t��Jant��154t��opqsq��~����t��Febt��187t��mnosq��~����t��Febt��753t��hudsq��~����t��Febt��564t��iowsq��~����t��Mart��874t��nhqsq��~����t��Mart��876t��zopsq��~����t��Mart��532t��fhesq��~����t��Aprt��334t��oih`
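
Since raw serialized bytes do not survive being pasted as text, one way to share them safely is to Base64-encode them first. The following is only a sketch using `java.util.Base64` from the JDK. The object name `Base64Paste` is hypothetical, and whether the Kafka messages in this issue carry the bytes as Base64 is an assumption, not something shown above.

```scala
import java.util.Base64

object Base64Paste {
  // Encode raw bytes as an ASCII-only string that survives copy and paste.
  def encode(bytes: Array[Byte]): String =
    Base64.getEncoder.encodeToString(bytes)

  // Recover the exact original bytes from the pasted string.
  def decode(text: String): Array[Byte] =
    Base64.getDecoder.decode(text)
}
```

Decoding the encoded string gives back the original bytes unchanged, so the contents of output.ser could be pasted in this form and restored on the other side.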