I had no hand in the design, but it is very elegant and I'll throw in my two cents.
Avro is an interchange format. The in-memory representation is entirely up to you and your implementation language of choice. The provided Java implementation allows for seamless mixing of Generic (everything is an Object with some conventions, e.g. strings must be some sort of CharSequence but are generally read as String or Utf8, and arrays are handled as java.util.List), Specific (which allows generated Java classes for record schemas and real Java enums for enum schemas), and Reflect (which lets you serialize/deserialize regular Java objects via reflection, and incidentally DOES support (de)serialization of native arrays).

Since at the Generic level the in-memory representation of any schema is an Object (that includes the primitive types, which must be boxed; whether null counts we could argue about semantically), it would be hard to deal with unboxed primitive array elements anyway. At that point, I don't think there is any real benefit to using native arrays, and as mentioned, java.util.List provides a more flexible interface. (Note that when serializing, as opposed to deserializing, any java.util.Collection will do, though it is to your benefit to use one with a defined ordering.) Note also that Avro supports object reuse during deserialization, which is more likely to be effective with a List implementation, since you can't change the size of an array.

If you really do care (as per my point about elegance above), you can implement your own in-memory representations, though you'd want to have a pretty good reason, and I'm not suggesting this is one of them. Indeed this is a feature we use ourselves, where for a certain application data type the most natural in-memory representation is quite different from the most efficient serialized schema.
Avro makes it easy for us to do this without "hacking" anything, though at the cost of implementing a relatively small amount of code, and in our case we only care about it in Java.

On Sep 24, 2013, at 2:20 PM, Mika Ristimaki <[email protected]> wrote:

>
> On Sep 24, 2013, at 9:46 PM, Raihan Jamal <[email protected]> wrote:
>
>> Thanks a lot Mika. Yeah, it works now but my second question is- Does the
>> avro schema that I have made looks good as compared to JSON value that we
>> were using previously?
>> I thought we can use an array for that so designed like that using an Apache
>> Avro..
>>
>
> This is an application design question, and not related to Avro. If you have
> a list of prices, array is a good place to store them.
>
>> And also why Avro Array uses java.util.List datatype? Just curious to know
>> on that as well.
>
> Someone who has actually designed Avro can answer this better, but I assume
> that List was chosen because it is much more convenient to use than java
> arrays. You don't need to know the size before hand, etc.
>
> -Mika
>
>>
>> Thanks for the help.
>>
>>
>> Raihan Jamal
>>
>>
>> On Tue, Sep 24, 2013 at 11:40 AM, Mika Ristimaki <[email protected]>
>> wrote:
>> Hi,
>>
>> Avro array uses java.util.List datatype. So you must do something like
>>
>> List<Double> nums = new ArrayList<Double>();
>> nums.add(new Double(9.97));
>> .
>> .
>>
>> On Sep 24, 2013, at 9:02 PM, Raihan Jamal <[email protected]> wrote:
>>
>>> Earlier, I was using JSON in our project so one of our attribute data looks
>>> like below in JSON format. Below is the attribute `e3` data in JSON format.
>>>
>>> {"lv":[{"v":{"prc":9.97}},{"v":{"prc":5.56}},{"v":{"prc":21.48}}]}
>>>
>>> Now, I am planning to use Apache Avro for our Data Serialization format. So
>>> I decided to design the Avro schema for the above attributes data. And I
>>> came up with the below design.
>>>
>>> {
>>>   "namespace": "com.avro.test.AvroExperiment",
>>>   "type": "record",
>>>   "name": "AVG_PRICE",
>>>   "doc": "AVG_PRICE data",
>>>   "fields": [
>>>     {"name": "prc", "type": {"type": "array", "items": "double"}}
>>>   ]
>>> }
>>>
>>> Now, I am not sure whether the above schema looks right or not
>>> corresponding to the values I have in JSON? Can anyone help me on that?
>>> Assuming the above schema looks correct, if I try to serialize the data
>>> using the above avro schema, I always get the below error-
>>>
>>> double[] nums = new double[] { 9.97, 5.56, 21.48 };
>>>
>>> Schema schema = new
>>> Parser().parse((AvroExperiment.class.getResourceAsStream("/aspmc.avsc")));
>>> GenericRecord record = new GenericData.Record(schema);
>>> record.put("prc", nums);
>>>
>>> GenericDatumWriter<GenericRecord> writer = new
>>> GenericDatumWriter<GenericRecord>(schema);
>>> ByteArrayOutputStream os = new ByteArrayOutputStream();
>>>
>>> Encoder e = EncoderFactory.get().binaryEncoder(os, null);
>>>
>>> // this line gives me exception..
>>> writer.write(record, e);
>>>
>>> Below is the exception, I always get-
>>>
>>> Exception in thread "main" java.lang.ClassCastException: [D
>>> incompatible with java.util.Collection
>>>
>>> Any idea what wrong I am doing here?
>>
>>
>
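P.S. For the ClassCastException in the quoted code, here is a minimal sketch of the conversion Mika describes, assuming you already hold the prices in a double[]. The class and method names (PriceLists, toList, refill) are made up for illustration and are not part of Avro; only the record.put("prc", ...) call mentioned in the comment comes from the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Avro: boxes a double[] into a List<Double>,
// which is the kind of value the Generic representation expects for an array field.
final class PriceLists {
    private PriceLists() {}

    /** Copies each primitive element into a new List<Double>, preserving order. */
    static List<Double> toList(double[] values) {
        List<Double> boxed = new ArrayList<>(values.length);
        for (double v : values) {
            boxed.add(v); // autoboxing: double -> Double
        }
        return boxed;
    }

    /**
     * Refills an existing list in place and returns the same instance.
     * A list can shrink or grow between uses; a double[] cannot.
     */
    static List<Double> refill(List<Double> reuse, double[] values) {
        reuse.clear();
        for (double v : values) {
            reuse.add(v);
        }
        return reuse;
    }

    public static void main(String[] args) {
        List<Double> prices = toList(new double[] {9.97, 5.56, 21.48});
        // In the quoted code, this list would be passed instead of the double[]:
        //     record.put("prc", prices);
        System.out.println(prices);
    }
}
```

The refill method is there only to show why a List suits reuse: the same instance can be cleared and regrown to a different length, which a native array cannot do.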
