Thanks, that is what I was missing.
From: Owen O'Malley <[email protected]>
To: [email protected]; Telco Phone <[email protected]>
Sent: Saturday, February 25, 2017 2:02 PM
Subject: Re: Setting Null correctly
On Sat, Feb 25, 2017 at 9:39 AM, Telco Phone <[email protected]> wrote:
Given the code below, I am trying to find the correct way to set nulls in
various vectors.
In the case of Long or Bytes vectors, how do you correctly set nulls?
The lines in question are:
col1.isNull[4] = Boolean.TRUE; <--- does not set to null but sets to 0 in output
col2.isNull[4] = Boolean.TRUE; <--- throws error on write
It is easier to use "true" instead of "Boolean.TRUE":
col1.isNull[4] = true;
col2.isNull[4] = true;
You also need to set ColumnVector.noNulls
http://orc.apache.org/api/hive-storage-api/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.html#noNulls
to false:
col1.noNulls = false;
col2.noNulls = false;
.. Owen
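For anyone reading this thread later, here is a small self-contained sketch of why setting isNull alone is not enough. It does not use the real hive-storage-api classes: SketchLongVector and readCell are hypothetical stand-ins. The rule that a consumer trusts noNulls before it looks at isNull is an assumption, inferred from the symptoms described in this thread (a 0 in the output for the long column), not copied from the ORC writer.

```java
public class NullFlagSketch {

    /** Hypothetical stand-in for a long column vector; NOT the real LongColumnVector. */
    static final class SketchLongVector {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true; // assumed default: "this column holds no nulls"

        SketchLongVector(int size) {
            vector = new long[size];
            isNull = new boolean[size];
        }
    }

    /**
     * Assumed consumer logic: if noNulls is true, isNull is never consulted.
     * Returns null for a cell that is treated as null.
     */
    static Long readCell(SketchLongVector col, int row) {
        if (col.noNulls || !col.isNull[row]) {
            return col.vector[row];
        }
        return null;
    }

    public static void main(String[] args) {
        SketchLongVector col1 = new SketchLongVector(5);
        for (int i = 0; i < 4; i++) {
            col1.vector[i] = 9L;
        }

        // Only isNull is set: the flag is ignored and the default 0 leaks through.
        col1.isNull[4] = true;
        System.out.println(readCell(col1, 4)); // prints 0

        // With noNulls cleared as well, the cell is treated as null.
        col1.noNulls = false;
        System.out.println(readCell(col1, 4)); // prints null
    }
}
```

Under the same assumption, the Bytes case would fail rather than print 0, because the slot for row 4 holds no byte array at all.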
Thanks in advance
void example() {
    String s = "struct<_col0:bigint,_col1:string>";
    TypeDescription schema = TypeDescription.fromString(s);

    // Build col0
    LongColumnVector col1 = new LongColumnVector(5);
    col1.init();
    col1.vector[0] = 9L;
    col1.vector[1] = 9L;
    col1.vector[2] = 9L;
    col1.vector[3] = 9L;
    col1.isNull[4] = Boolean.TRUE;

    // Build col1
    BytesColumnVector col2 = new BytesColumnVector();
    col2.init();
    col2.initBuffer();
    byte[] byteString = "Test0".getBytes();
    col2.setVal(0, byteString, 0, byteString.length);
    byteString = "Test1".getBytes();
    col2.setVal(1, byteString, 0, byteString.length);
    byteString = "Test2".getBytes();
    col2.setVal(2, byteString, 0, byteString.length);
    byteString = "Test3".getBytes();
    col2.setVal(3, byteString, 0, byteString.length);
    byteString = null;
    col2.isNull[4] = Boolean.TRUE;

    VectorizedRowBatch batch = schema.createRowBatch();
    batch.cols[0] = col1;
    batch.cols[1] = col2;
    batch.size = 5;

    try {
        File f = new File("/tmp/my-file.orc");
        f.delete();
        Configuration conf = new Configuration();
        Writer writer = OrcFile.createWriter(new Path("/tmp/my-file.orc"),
            OrcFile.writerOptions(conf).setSchema(schema));
        writer.addRowBatch(batch);
        writer.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}