[ https://issues.apache.org/jira/browse/ARROW-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649014#comment-15649014 ]
Julien Le Dem commented on ARROW-372:
-------------------------------------
The JSON representation of the schema is defined here:
https://github.com/apache/arrow/blob/master/format/Metadata.md#schemas
Example:
{noformat}
"schema" : {
  "fields" : [{
    "name" : "int",
    "nullable" : true,
    "type" : {
      "name" : "int",
      "bitWidth" : 32,
      "isSigned" : true
    },
    "children" : [ ],
    "typeLayout" : {
      "vectors" : [{
        "type" : "VALIDITY",
        "typeBitWidth" : 1
      },{
        "type" : "DATA",
        "typeBitWidth" : 32
      }]
    }
  },{
    "name" : "bigInt",
    "nullable" : true,
    "type" : {
      "name" : "int",
      "bitWidth" : 64,
      "isSigned" : true
    },
    "children" : [ ],
    "typeLayout" : {
      "vectors" : [{
        "type" : "VALIDITY",
        "typeBitWidth" : 1
      },{
        "type" : "DATA",
        "typeBitWidth" : 64
      }]
    }
  },{
    "name" : "list",
    "nullable" : true,
    "type" : {
      "name" : "list"
    },
    "children" : [{
      "nullable" : true,
      "type" : {
        "name" : "utf8"
      },
      "children" : [ ],
      "typeLayout" : {
        "vectors" : [{
          "type" : "VALIDITY",
          "typeBitWidth" : 1
        },{
          "type" : "OFFSET",
          "typeBitWidth" : 32
        },{
          "type" : "DATA",
          "typeBitWidth" : 8
        }]
      }
    }],
    "typeLayout" : {
      "vectors" : [{
        "type" : "VALIDITY",
        "typeBitWidth" : 1
      },{
        "type" : "OFFSET",
        "typeBitWidth" : 32
      }]
    }
  },{
    "name" : "map",
    "nullable" : false,
    "type" : {
      "name" : "struct"
    },
    "children" : [{
      "name" : "timestamp",
      "nullable" : true,
      "type" : {
        "name" : "timestamp",
        "unit" : "MILLISECOND"
      },
      "children" : [ ],
      "typeLayout" : {
        "vectors" : [{
          "type" : "VALIDITY",
          "typeBitWidth" : 1
        },{
          "type" : "DATA",
          "typeBitWidth" : 64
        }]
      }
    }],
    "typeLayout" : {
      "vectors" : [{
        "type" : "VALIDITY",
        "typeBitWidth" : 1
      }]
    }
  }]
},
{noformat}
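As a rough illustration of how a reader could consume this schema representation, the Python sketch below walks the fields recursively and lists the vectors declared in each field's typeLayout. The helper name `describe_field`, the `<unnamed>` placeholder, and the abbreviated inline schema (two of the four fields) are assumptions made for this example only; they are not part of Arrow.

```python
import json

# Abbreviated copy of the example schema (only the "int" and "list" fields).
SCHEMA_JSON = """
{
  "fields": [
    {"name": "int", "nullable": true,
     "type": {"name": "int", "bitWidth": 32, "isSigned": true},
     "children": [],
     "typeLayout": {"vectors": [
       {"type": "VALIDITY", "typeBitWidth": 1},
       {"type": "DATA", "typeBitWidth": 32}]}},
    {"name": "list", "nullable": true,
     "type": {"name": "list"},
     "children": [
       {"nullable": true,
        "type": {"name": "utf8"},
        "children": [],
        "typeLayout": {"vectors": [
          {"type": "VALIDITY", "typeBitWidth": 1},
          {"type": "OFFSET", "typeBitWidth": 32},
          {"type": "DATA", "typeBitWidth": 8}]}}],
     "typeLayout": {"vectors": [
       {"type": "VALIDITY", "typeBitWidth": 1},
       {"type": "OFFSET", "typeBitWidth": 32}]}}
  ]
}
"""


def describe_field(field, depth=0):
    """Return one line per field: name, type name, and (vector, bit width) pairs."""
    vectors = [(v["type"], v["typeBitWidth"])
               for v in field["typeLayout"]["vectors"]]
    # The list child in the example carries no "name", so fall back to a placeholder.
    name = field.get("name", "<unnamed>")
    lines = ["{}{}: {} {}".format("  " * depth, name, field["type"]["name"], vectors)]
    for child in field["children"]:
        lines.extend(describe_field(child, depth + 1))
    return lines


schema = json.loads(SCHEMA_JSON)
summary = [line for f in schema["fields"] for line in describe_field(f)]
print("\n".join(summary))
```

Nested fields are indented under their parent, which makes it easy to compare the declared vectors against the arrays that appear for each column in a batch.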
> Create JSON arrow file format for integration tests
> ---------------------------------------------------
>
> Key: ARROW-372
> URL: https://issues.apache.org/jira/browse/ARROW-372
> Project: Apache Arrow
> Issue Type: Task
> Components: Java - Vectors
> Reporter: Julien Le Dem
> Assignee: Julien Le Dem
>
> {noformat}
> {
>   "schema" : ...,
>   "batches" : [{
>     "count" : 10,
>     "columns" : [
>       {
>         "name": "{col_name_int}",
>         "count" : 10,
>         "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
>         "DATA" : [0,1,2,3,4,5,6,7,8,9]
>       },
>       {
>         "name": "{col_name_list}",
>         "count" : 10,
>         "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
>         "OFFSET" : [0,0,1,3,3,4,6,6,7,9],
>         "children" : [
>           {
>             "name": "child_name",
>             "count" : 9,
>             "VALIDITY" : [1,1,1,1,1,1,1,1,1],
>             "OFFSET" : [0,3,6,9,12,15,18,21,24],
>             "DATA" : ["abc","abc","abc","abc","abc","abc","abc","abc","abc"]
>           }
>         ]
>       },
>       {
>         "name": "{col_name_map}",
>         "count" : 10,
>         "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
>         "children" : [
>           {
>             "name": "{col_name_timestamp}",
>             "count" : 10,
>             "VALIDITY" : [1,1,1,1,1,1,1,1,1,1],
>             "DATA" : [0,1,2,3,4,5,6,7,8,9]
>           }
>         ]
>       }
>     ]
>   }, ... ]
> }
> {noformat}
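To make the intended reading of the list column concrete, here is a minimal Python sketch that rebuilds the nested values from VALIDITY, OFFSET and the child column. It rests on two assumptions the description does not state: that "children" is a JSON array, and that OFFSET[i] is the index of the first child value of list i, with each list ending where the next one starts (or at the child's count for the last list). The function name `decode_list_column` is hypothetical.

```python
def decode_list_column(column):
    """Rebuild Python lists from a list column of the proposed JSON batch format.

    Assumption (not confirmed by the proposal): OFFSET[i] is the start index
    of list i in the child column, and list i ends where list i + 1 starts,
    or at the child's count for the last list.
    """
    child = column["children"][0]
    # Child values, with None wherever the child's validity bit is 0.
    values = [v if valid else None
              for v, valid in zip(child["DATA"], child["VALIDITY"])]
    starts = column["OFFSET"]
    ends = starts[1:] + [child["count"]]
    return [values[start:end] if valid else None
            for start, end, valid in zip(starts, ends, column["VALIDITY"])]


# The list column from the example batch, with placeholder names filled in.
column = {
    "name": "list",
    "count": 10,
    "VALIDITY": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "OFFSET": [0, 0, 1, 3, 3, 4, 6, 6, 7, 9],
    "children": [{
        "name": "child_name",
        "count": 9,
        "VALIDITY": [1] * 9,
        "OFFSET": [0, 3, 6, 9, 12, 15, 18, 21, 24],
        "DATA": ["abc"] * 9,
    }],
}

rows = decode_list_column(column)
for row in rows:
    print(row)
```

Because the utf8 values already appear as JSON strings in DATA, the child's own OFFSET array is not needed for decoding here; it would only matter when checking the JSON against the binary layout.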