[jira] [Commented] (AVRO-1927) If a default value is set, Avro allows null values in non-nullable fields.
[ https://issues.apache.org/jira/browse/AVRO-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547243#comment-16547243 ]

Ryan Blue commented on AVRO-1927:
---------------------------------

Sounds like the problem here is actually that there is no validation in the Avro-specific builders or in object creation. Avro is doing the right thing by not allowing you to serialize invalid data. I think it makes sense to add null checks to the generated builders. Feel free to fix this and open a PR!

> If a default value is set, Avro allows null values in non-nullable fields.
> --------------------------------------------------------------------------
>
>                 Key: AVRO-1927
>                 URL: https://issues.apache.org/jira/browse/AVRO-1927
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.8.1
>            Reporter: Andreas Maier
>            Priority: Major
>
> With an Avro schema like
> {code}
> {
>   "name": "myfield",
>   "type": "string",
>   "default": ""
> }
> {code}
> the following code should throw an exception:
> {code}
> MyObject myObject = MyObject.newBuilder().setMyfield(null).build();
> {code}
> But instead the value of myfield is set to null, which causes an exception
> later when serializing myObject, because null is not a valid value for
> myfield.
> I believe in this case setMyfield(null) should throw an exception,
> independent of the value of default.
> See also
> https://stackoverflow.com/questions/38509279/generated-avro-builder-set-null-doesnt-overwrite-with-default

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
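A minimal sketch of the null check suggested above; `MyObject` and its `Builder` are hypothetical stand-ins for avro-compiler output, not the real generated code:

```java
// Hypothetical stand-in for a generated class; the point is the
// suggested null validation in the builder setter.
public class MyObject {
  private String myfield;

  public String getMyfield() { return myfield; }

  public static Builder newBuilder() { return new Builder(); }

  public static class Builder {
    private String myfield;

    public Builder setMyfield(String value) {
      // Fail fast: a non-nullable field rejects null at set time,
      // not later at serialization time.
      if (value == null) {
        throw new NullPointerException("myfield must not be null");
      }
      this.myfield = value;
      return this;
    }

    public MyObject build() {
      MyObject record = new MyObject();
      record.myfield = this.myfield;
      return record;
    }
  }

  public static void main(String[] args) {
    MyObject ok = newBuilder().setMyfield("hello").build();
    System.out.println(ok.getMyfield());  // hello
    try {
      newBuilder().setMyfield(null);
    } catch (NullPointerException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With this change the failure surfaces at the `setMyfield(null)` call, independent of the field's default value.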
[jira] [Commented] (AVRO-2164) Make Decimal a first class type.
[ https://issues.apache.org/jira/browse/AVRO-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523943#comment-16523943 ]

Ryan Blue commented on AVRO-2164:
---------------------------------

First-class types are difficult to add because they break the format's forward compatibility (old readers can't read newer data). I don't think there's a compelling argument for adding decimal as a primitive anyway; we can make it work with logical types.

Similarly, I don't see a benefit to including the scale in the serialized form. There's nothing that would achieve that we can't do with the scale encoded in the schema, other than storing values with different scales, which is beyond the scope of the type (because no SQL system supports it).

Part of the problem is that we don't have well-defined rules for decimal evolution. Because changing the scale of a value changes the value itself (4.00 is NOT equal to 4.000), I think that at a minimum decimals should always be returned in the scale they were written with. That would solve many of these problems, right? I'd like to hear ideas for clearly defined rules about what happens when you evolve a decimal. (In Iceberg, we don't allow scale changes at all because of the problems here.) Without a clear set of rules first, I don't think we can confidently make changes.

> Make Decimal a first class type.
> --------------------------------
>
>                 Key: AVRO-2164
>                 URL: https://issues.apache.org/jira/browse/AVRO-2164
>             Project: Avro
>          Issue Type: Improvement
>          Components: logical types
>    Affects Versions: 1.8.2
>            Reporter: Andy Coates
>            Priority: Major
>
> I'd be interested to hear the community's thoughts on making decimal a
> first-class type.
> The current logical type encodes a decimal into a _bytes_ or _fixed_. This
> encoding does not include any information about the scale, i.e. this
> encoding is lossy.
> There are open issues around the compatibility / evolvability of schemas
> containing decimal logical types (e.g. AVRO-2078 & AVRO-1721) that mean
> reading data that was previously written with a different scale will result
> in data corruption.
> If these issues were fixed, with suitable compatibility checks put in place,
> this would then make it impossible to evolve an Avro schema where the scale
> needs to be changed. This inability to evolve the scale is very restrictive,
> and can result in high overhead for organizations that _need_ to change the
> scale, i.e. they may potentially need to copy their entire data set,
> deserializing with the old scale and re-serializing with the new.
> If _decimal_ were promoted to a first-class type, this would allow the scale
> to be captured in the serialized form, allowing for schema evolution support.
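The point that changing a decimal's scale changes the value itself can be seen directly in Java's `BigDecimal`, where `equals` compares both the unscaled value and the scale, while `compareTo` compares only the numeric value:

```java
import java.math.BigDecimal;

public class ScaleDemo {
  public static void main(String[] args) {
    // equals() compares unscaled value AND scale; compareTo() compares
    // only the numeric value. This is why rescaling stored decimals is
    // not a value-preserving operation.
    BigDecimal a = new BigDecimal("4.00");   // unscaled 400, scale 2
    BigDecimal b = new BigDecimal("4.000");  // unscaled 4000, scale 3
    System.out.println(a.equals(b));     // false
    System.out.println(a.compareTo(b));  // 0
  }
}
```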
[jira] [Commented] (AVRO-2164) Make Decimal a first class type.
[ https://issues.apache.org/jira/browse/AVRO-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411698#comment-16411698 ]

Ryan Blue commented on AVRO-2164:
---------------------------------

The current decimal type has a fixed scale, so you use the same scale for an entire file. This matches the behavior of decimal columns in SQL. I'm not sure what you mean by "provided by Avro out of the box": when you read decimals, the BigDecimal instances have the column's scale. Avro should also verify that all of the decimals passed in have the correct scale, and reject those that don't.
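The fixed-scale design means only the unscaled value goes on the wire; readers reattach the column's scale from the schema. A sketch of that round trip, mirroring the shape of Avro's `Conversions.DecimalConversion` (the variable names here are illustrative):

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class DecimalEncodingDemo {
  public static void main(String[] args) {
    int columnScale = 2;  // from the schema's "scale" attribute
    BigDecimal written = new BigDecimal("12.34");

    // Only the unscaled value (1234) is serialized, as
    // two's-complement bytes.
    byte[] bytes = written.unscaledValue().toByteArray();

    // A reader reconstructs the value using the column's scale.
    BigDecimal read = new BigDecimal(new BigInteger(bytes), columnScale);
    System.out.println(read);  // 12.34
  }
}
```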
[jira] [Commented] (AVRO-2164) Make Decimal a first class type.
[ https://issues.apache.org/jira/browse/AVRO-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411615#comment-16411615 ]

Ryan Blue commented on AVRO-2164:
---------------------------------

[~hp9000], I don't follow. How is it lossy? What is "kind of" lossy?
[jira] [Commented] (AVRO-2164) Make Decimal a first class type.
[ https://issues.apache.org/jira/browse/AVRO-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411572#comment-16411572 ]

Ryan Blue commented on AVRO-2164:
---------------------------------

I disagree that the current format is lossy. The decimal scale is stored in the logical type and should be uniform. There may be a problem where Avro doesn't reject values with the wrong scale, but that's a simple bug, not a flaw in the format.

If you want a type that stores the scale of each value alongside the value itself, then that's a new kind of decimal that we can think about adding. But I think it should be a logical type, or else this would require an incompatible change to the format.
[jira] [Commented] (AVRO-1605) Remove Jackson classes from public API
[ https://issues.apache.org/jira/browse/AVRO-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16282158#comment-16282158 ]

Ryan Blue commented on AVRO-1605:
---------------------------------

I don't think Avro should use accessors or friend packages, so I'm -1 on a patch that requires that. I don't think there's a reason why we need to use them. It is convenient to pass parsed JSON across package boundaries, but we should pass objects that are valid in the public API instead.

> Remove Jackson classes from public API
> --------------------------------------
>
>                 Key: AVRO-1605
>                 URL: https://issues.apache.org/jira/browse/AVRO-1605
>             Project: Avro
>          Issue Type: Sub-task
>          Components: java
>    Affects Versions: 1.7.8
>            Reporter: Tom White
>            Assignee: Gabor Szadovszky
>             Fix For: 1.9.0
>

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AVRO-2021) uuid logical type is not documented
[ https://issues.apache.org/jira/browse/AVRO-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242524#comment-16242524 ]

Ryan Blue commented on AVRO-2021:
---------------------------------

[~nkollar], thanks for pointing out this issue. I think we should revisit this.

Logical types in the spec are defined for additional (and optional) interoperability between Avro implementations and systems that use Avro. We generally want to keep the set of logical types small and well defined, to avoid putting so many requirements out there that nothing actually implements them. The UUID type in Java is an example of this: there is no real benefit to defining the UUID logical type in the spec, unless we want to define how it should be stored. Instead, the UUID logical type in Java is an example of how you can use the logical types API in Java to convert between representations.

If we want to define how a UUID should be stored, as a 16-byte fixed, then I think that makes sense. Otherwise, we're defining how UUIDs should be stored for interoperability and specifying a requirement that is both inefficient and obvious. Any interest in updating this to use a 16-byte fixed? If not, I think we should remove this before it makes it into a release.

> uuid logical type is not documented
> -----------------------------------
>
>                 Key: AVRO-2021
>                 URL: https://issues.apache.org/jira/browse/AVRO-2021
>             Project: Avro
>          Issue Type: Improvement
>            Reporter: Andrew Rosca
>            Assignee: Nandor Kollar
>            Priority: Minor
>             Fix For: 1.9.0, 1.8.3
>
>         Attachments: AVRO-2021_1.patch
>
> The documentation does not mention anything about the _uuid_ logical type,
> which is in fact implemented in LogicalTypes.java.
> Add documentation for _uuid_.
[jira] [Commented] (AVRO-1962) Support UUID logical type
[ https://issues.apache.org/jira/browse/AVRO-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240598#comment-16240598 ]

Ryan Blue commented on AVRO-1962:
---------------------------------

There is no UUID logical type in the Avro spec, which means that there is no interoperability requirement for UUIDs across Avro implementations or downstream projects. I suggested adding it to Parquet, but that's because Parquet doesn't allow logical types that aren't defined in the spec. Avro, on the other hand, allows custom logical types that are used to annotate data that should be transformed.

UUID is an example of a type where we don't need interop requirements when storing the value as a String, but it is nice to get the result of deserialization as a Java UUID, or to be able to pass a UUID in to be serialized. If we want to start storing UUIDs as 16-byte binary, then we should add UUID to the spec and define how it should be stored, as either a String or as big-endian bytes.

> Support UUID logical type
> -------------------------
>
>                 Key: AVRO-1962
>                 URL: https://issues.apache.org/jira/browse/AVRO-1962
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.8.1
>            Reporter: Tianxiang Xiong
>
> The AVRO-1554 ticket seems to suggest that the issue of representing UUIDs is
> resolved with [logical types|http://avro.apache.org/docs/1.8.1/spec.html#Logical+Types]
> in Avro 1.8.1. However, there is [no UUID logical type in Avro
> 1.8.1|https://github.com/apache/avro/blob/release-1.8.1/lang/java/compiler/src/main/javacc/org/apache/avro/compiler/idl/idl.jj#L214-L244].
> The specification offers several examples of using logical types; decimals
> are represented as:
> {code}
> {
>   "type": "bytes",
>   "logicalType": "decimal",
>   "precision": 4,
>   "scale": 2
> }
> {code}
> No examples for UUID are offered, presumably because UUIDs are not supported.
> Thanks to [~Yibing]'s confirmation on the mailing list that this is the case.
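The proposed 16-byte big-endian encoding would follow naturally from Java's UUID representation as two longs. A sketch of the round trip (this encoding is a proposal in the discussion above, not part of the Avro spec):

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidFixedDemo {
  public static void main(String[] args) {
    UUID uuid = UUID.fromString("f81d4fae-7dec-11d0-a765-00a0c91e6bf6");

    // ByteBuffer is big-endian by default: write the most-significant
    // long, then the least-significant long.
    byte[] fixed = ByteBuffer.allocate(16)
        .putLong(uuid.getMostSignificantBits())
        .putLong(uuid.getLeastSignificantBits())
        .array();

    // Decoding reverses the process.
    ByteBuffer buf = ByteBuffer.wrap(fixed);
    UUID decoded = new UUID(buf.getLong(), buf.getLong());
    System.out.println(decoded.equals(uuid));  // true
  }
}
```

Compared with the 36-character string form, the 16-byte fixed halves the storage and avoids string parsing on read.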
[jira] [Commented] (AVRO-2099) Decimal precision is ignored
[ https://issues.apache.org/jira/browse/AVRO-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219208#comment-16219208 ]

Ryan Blue commented on AVRO-2099:
---------------------------------

I agree, the deserializer shouldn't fail. Decimals with the wrong scale should be rejected when writing, and so should decimals with precision higher than the declared maximum.

> Decimal precision is ignored
> ----------------------------
>
>                 Key: AVRO-2099
>                 URL: https://issues.apache.org/jira/browse/AVRO-2099
>             Project: Avro
>          Issue Type: Improvement
>            Reporter: Kornel Kiełczewski
>
> According to the documentation
> https://avro.apache.org/docs/1.8.1/spec.html#Decimal
> {quote}
> The decimal logical type represents an arbitrary-precision signed decimal
> number of the form unscaled × 10^-scale.
> {quote}
> Then in the schema we might have an entry like:
> {code}
> {
>   "type": "bytes",
>   "logicalType": "decimal",
>   "precision": 4,
>   "scale": 2
> }
> {code}
> However, in the Java deserialization I see that the precision is ignored:
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/Conversions.java#L79
> {code}
> @Override
> public BigDecimal fromBytes(ByteBuffer value, Schema schema, LogicalType type) {
>   int scale = ((LogicalTypes.Decimal) type).getScale();
>   // always copy the bytes out because BigInteger has no offset/length ctor
>   byte[] bytes = new byte[value.remaining()];
>   value.get(bytes);
>   return new BigDecimal(new BigInteger(bytes), scale);
> }
> {code}
> The logical type definition in the Java API requires the precision to be set:
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/LogicalTypes.java#L116
> {code}
> /** Create a Decimal LogicalType with the given precision and scale */
> public static Decimal decimal(int precision, int scale) {
>   return new Decimal(precision, scale);
> }
> {code}
> Is this a feature, that we allow arbitrary precision? If so, why do we have
> the precision in the API and schema, if it's ignored?
> Maybe that's some Java-specific issue?
> Thanks for any hints.
[jira] [Commented] (AVRO-2099) Decimal precision is ignored
[ https://issues.apache.org/jira/browse/AVRO-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217132#comment-16217132 ]

Ryan Blue commented on AVRO-2099:
---------------------------------

With decimals, rounding changes the value: 12.34 is not equal to 12.340. As a storage format, Avro should never change the values it stores, so it can't round to the given precision. It should probably reject values that can't be stored, though.
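The write-time checks suggested above could look roughly like this (a sketch of the proposed behavior, not what Avro currently does; `precision` and `scale` would come from the schema's decimal logical type):

```java
import java.math.BigDecimal;

public class PrecisionCheckDemo {
  public static void main(String[] args) {
    int precision = 4, scale = 2;  // from the schema
    BigDecimal value = new BigDecimal("12.34");

    // Reject a value whose scale differs from the column's scale,
    // rather than silently rescaling (which would change the value).
    if (value.scale() != scale) {
      throw new IllegalArgumentException("Cannot encode decimal with scale "
          + value.scale() + " as scale " + scale);
    }
    // Reject a value whose precision exceeds the declared maximum,
    // rather than silently rounding.
    if (value.precision() > precision) {
      throw new IllegalArgumentException("Precision " + value.precision()
          + " exceeds declared maximum " + precision);
    }
    System.out.println("ok: " + value);  // ok: 12.34
  }
}
```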
[jira] [Commented] (AVRO-1672) Add logical types and conversions for date, time, and timestamp.
[ https://issues.apache.org/jira/browse/AVRO-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056001#comment-16056001 ]

Ryan Blue commented on AVRO-1672:
---------------------------------

[~DeaconDesperado], contributions are always welcome! Feel free to open a PR with it.

> Add logical types and conversions for date, time, and timestamp.
> ----------------------------------------------------------------
>
>                 Key: AVRO-1672
>                 URL: https://issues.apache.org/jira/browse/AVRO-1672
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.7.7
>            Reporter: Ryan Blue
>            Assignee: Ryan Blue
>             Fix For: 1.8.0
>
>         Attachments: AVRO-1672-1.patch, AVRO-1672-2.patch
>
> AVRO-739 added specs for date, time (ms), and timestamp (ms) logical types.
> Now that AVRO-1497 has been committed, we should add those new logical types
> and conversions for them.
[jira] [Commented] (AVRO-2042) Can't reliably write decimal/BigDecimal values to file
[ https://issues.apache.org/jira/browse/AVRO-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050899#comment-16050899 ]

Ryan Blue commented on AVRO-2042:
---------------------------------

Here's a pretty good primer on Java's BigDecimal type: http://www.opentaps.org/docs/index.php/How_to_Use_Java_BigDecimal:_A_Tutorial

> Can't reliably write decimal/BigDecimal values to file
> ------------------------------------------------------
>
>                 Key: AVRO-2042
>                 URL: https://issues.apache.org/jira/browse/AVRO-2042
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.8.2
>            Reporter: Fred Cohen
>            Assignee: Ryan Blue
>         Attachments: AvroDecimal.avsc, AvroDecimalTest.java
>
> Attempting to write some decimal values fails.
> Here's the schema I created for this test:
> {code}
> {
>   "namespace": "org.test",
>   "type": "record",
>   "name": "DecimalTest",
>   "fields": [
>     {
>       "name": "decimalVal",
>       "type": {"type": "bytes", "logicalType": "decimal", "precision": 6, "scale": 4}
>     }
>   ]
> }
> {code}
> The problem is that to my knowledge I can't control the scale used by
> BigDecimal.
> While the first value uses a scale of 4, the second one is scale 0.
> decimals.add(new BigDecimal(12.01F, mathContext)); > decimals.add(new BigDecimal(12.00F, mathContext)); > It leads to these errors during the write operation: > org.apache.avro.file.DataFileWriter$AppendWriteException: > org.apache.avro.AvroTypeException: Cannot encode decimal with scale 0 as > scale 4 > at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308) > at AvroDecimalTest.writeDecimalTest(AvroDecimalTest.java:67) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) > at > org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184) > at org.junit.runners.ParentRunner.run(ParentRunner.java:236) > at org.junit.runner.JUnitCore.run(JUnitCore.java:157) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > at > 
com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51) > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237) > at > com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) > Caused by: org.apache.avro.AvroTypeException: Cannot encode decimal with > scale 0 as scale 4 > at > org.apache.avro.Conversions$DecimalConversion.toBytes(Conversions.java:92) > at > org.apache.avro.Conversions$DecimalConversion.toBytes(Conversions.java:62) > at org.apache.avro.Conversions.convertToRawType(Conversions.java:218) > at > org.apache.avro.generic.GenericDatumWriter.convert(GenericDatumWriter.java:95) > at > org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:84) > at > org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:156) > at > org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:118) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:75) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62) > at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:302) > ... 23 more -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AVRO-2042) Can't reliably write decimal/BigDecimal values to file
[ https://issues.apache.org/jira/browse/AVRO-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Blue resolved AVRO-2042.
-----------------------------
    Resolution: Not A Problem
      Assignee: Ryan Blue
[jira] [Commented] (AVRO-2042) Can't reliably write decimal/BigDecimal values to file
[ https://issues.apache.org/jira/browse/AVRO-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050746#comment-16050746 ] Ryan Blue commented on AVRO-2042: - The problem is that you're converting floats, which causes the BigDecimal code to guess what the intended scale is. Instead, you should use the String constructor. You can see the tests for an example: https://github.com/apache/avro/blob/master/lang/java/avro/src/test/java/org/apache/avro/generic/TestGenericLogicalTypes.java#L140 Another option is to set the scale using [setScale|https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html#setScale(int,%20java.math.RoundingMode)] to normalize incoming doubles or floats. > Can't reliably write decimal/BigDecimal values to file > --- > > Key: AVRO-2042 > URL: https://issues.apache.org/jira/browse/AVRO-2042 > Project: Avro > Issue Type: Bug >Affects Versions: 1.8.2 >Reporter: Fred Cohen > Attachments: AvroDecimal.avsc, AvroDecimalTest.java > > > Attempting to write some decimal values fails. > Here's the schema I created for this test: > { > "namespace":"org.test", > "type": "record", > "name": "DecimalTest", > "fields": [ > { > "name": "decimalVal", > "type": {"type": "bytes", "logicalType":"decimal","precision": 6, > "scale": 4} >} > ] > } > The problem is that to my knowledge I can't control the scale used by > BigDecimal. > While the first value uses a scale of 4, the second one is scale 0. 
> decimals.add(new BigDecimal(12.01F, mathContext)); > decimals.add(new BigDecimal(12.00F, mathContext)); > It leads to these errors during the write operation: > org.apache.avro.file.DataFileWriter$AppendWriteException: > org.apache.avro.AvroTypeException: Cannot encode decimal with scale 0 as > scale 4 > at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308) > at AvroDecimalTest.writeDecimalTest(AvroDecimalTest.java:67) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) > at > org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184) > at org.junit.runners.ParentRunner.run(ParentRunner.java:236) > at org.junit.runner.JUnitCore.run(JUnitCore.java:157) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > at > 
com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51) > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237) > at > com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) > Caused by: org.apache.avro.AvroTypeException: Cannot encode decimal with > scale 0 as scale 4 > at > org.apache.avro.Conversions$DecimalConversion.toBytes(Conversions.java:92) > at > org.apache.avro.Conversions$DecimalConversion.toBytes(Conversions.java:62) > at org.apache.avro.Conversions.convertToRawType(Conversions.java:218) > at > org.apache.avro.generic.GenericDatumWriter.convert(GenericDatumWriter.java:95) > at > org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:84) > at > org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:156) > at > org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:118) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:75) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62) >
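The advice in the comment above (use the String constructor, or normalize with setScale) can be sketched in plain Java. The class and helper name below are illustrative, not part of Avro; the assumption is a decimal schema with scale 4, as in the attached AvroDecimal.avsc.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalScaleDemo {
    // Illustrative helper (not part of Avro): force every value onto the
    // schema's scale of 4 before handing it to the DatumWriter.
    static BigDecimal toSchemaScale(BigDecimal value) {
        return value.setScale(4, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // The float/double constructor uses the exact binary value, so the
        // scale is whatever that value implies: 0 for 12.00F (exactly 12),
        // and a long tail of digits for 12.01F.
        System.out.println(new BigDecimal(12.00F).scale()); // 0
        // The String constructor keeps exactly the scale you wrote:
        System.out.println(new BigDecimal("12.0000").scale()); // 4
        // Normalizing with setScale makes both values encodable as scale 4:
        System.out.println(toSchemaScale(new BigDecimal(12.01F))); // 12.0100
        System.out.println(toSchemaScale(new BigDecimal(12.00F))); // 12.0000
    }
}
```

This is why the reporter's second value failed: `new BigDecimal(12.00F, mathContext)` produces a value with scale 0, which the conversion refuses to encode as scale 4.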
[jira] [Commented] (AVRO-2023) backport and update license fixes for branch-1.7
[ https://issues.apache.org/jira/browse/AVRO-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997081#comment-15997081 ] Ryan Blue commented on AVRO-2023: - Every release tarball or artifact should have a LICENSE and NOTICE file. If there isn't one for the JS tarball, then we should add it. > backport and update license fixes for branch-1.7 > > > Key: AVRO-2023 > URL: https://issues.apache.org/jira/browse/AVRO-2023 > Project: Avro > Issue Type: Task > Components: build, community >Affects Versions: 1.7.7 >Reporter: Sean Busbey >Priority: Blocker > Fix For: 1.7.8 > > > Backport the changes from AVRO-1722 and subtasks, ensure they're applicable > for branch-1.7. > git history as summary: > {code} > $ git hist --grep AVRO-1722 --grep AVRO-1727 --grep AVRO-1728 --grep > AVRO-1729 --grep AVRO-1730 --grep AVRO-1731 --grep AVRO-1732 --grep AVRO-1733 > --grep AVRO-1734 --grep AVRO-1735 --grep AVRO-1736 --grep AVRO-1771 > origin/branch-1.8 > * 059e6df - AVRO-1771. Add LICENSE and NOTICE to avro-doc artifact. > Contributed by blue. (1 year, 3 months ago) > * ee28a20 - AVRO-1728: Java: Add LICENSE and NOTICE files to jars. (1 year, 4 > months ago) > * 80ba788 - AVRO-1722 ADDENDUM: Add last license doc changes, rat helper. (1 > year, 4 months ago) > * bf751df - AVRO-1722 ADDENDUM: Java: Fix tests broken by adding licenses. (1 > year, 4 months ago) > * f49b2d3 - AVRO-1722: Update root LICENSE.txt and NOTICE.txt. (1 year, 4 > months ago) > * 2c6c570 - AVRO-1730: Python3: Add LICENSE and NOTICE to distribution. (1 > year, 4 months ago) > * e11c588 - AVRO-1730: Python: Add LICENSE and NOTICE to distribution. (1 > year, 4 months ago) > * ada69ac - AVRO-1736: PHP: Add LICENSE and NOTICE to distribution. (1 year, > 4 months ago) > * 96d2d35 - AVRO-1735: Perl: Add LICENSE and NOTICE to distribution. (1 year, > 4 months ago) > * df5c513 - AVRO-1731: C: Add LICENSE and NOTICE to binary distribution. 
(1 > year, 4 months ago) > * 3585c45 - AVRO-1733: C#: Add LICENSE and NOTICE to binary distribution. (1 > year, 6 months ago) > * 5cb83db - AVRO-1732: C++: Add LICENSE and NOTICE to binary distribution. (1 > year, 6 months ago) > * bc19e3a - AVRO-1729: Ruby: Add LICENSE and NOTICE to ruby gems. (1 year, 6 > months ago) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (AVRO-2023) backport and update license fixes for branch-1.7
[ https://issues.apache.org/jira/browse/AVRO-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15991065#comment-15991065 ] Ryan Blue commented on AVRO-2023: - You can add the license files to the RAT excludes. For the JS files, just make sure the LICENSE file is updated and that the project doesn't require anything placed in NOTICE for the licensing used for the 1.7 version. > backport and update license fixes for branch-1.7 > > > Key: AVRO-2023 > URL: https://issues.apache.org/jira/browse/AVRO-2023 > Project: Avro > Issue Type: Task > Components: build, community >Affects Versions: 1.7.7 >Reporter: Sean Busbey >Priority: Blocker > Fix For: 1.7.8 > > > Backport the changes from AVRO-1722 and subtasks, ensure they're applicable > for branch-1.7. > git history as summary: > {code} > $ git hist --grep AVRO-1722 --grep AVRO-1727 --grep AVRO-1728 --grep > AVRO-1729 --grep AVRO-1730 --grep AVRO-1731 --grep AVRO-1732 --grep AVRO-1733 > --grep AVRO-1734 --grep AVRO-1735 --grep AVRO-1736 --grep AVRO-1771 > origin/branch-1.8 > * 059e6df - AVRO-1771. Add LICENSE and NOTICE to avro-doc artifact. > Contributed by blue. (1 year, 3 months ago) > * ee28a20 - AVRO-1728: Java: Add LICENSE and NOTICE files to jars. (1 year, 4 > months ago) > * 80ba788 - AVRO-1722 ADDENDUM: Add last license doc changes, rat helper. (1 > year, 4 months ago) > * bf751df - AVRO-1722 ADDENDUM: Java: Fix tests broken by adding licenses. (1 > year, 4 months ago) > * f49b2d3 - AVRO-1722: Update root LICENSE.txt and NOTICE.txt. (1 year, 4 > months ago) > * 2c6c570 - AVRO-1730: Python3: Add LICENSE and NOTICE to distribution. (1 > year, 4 months ago) > * e11c588 - AVRO-1730: Python: Add LICENSE and NOTICE to distribution. (1 > year, 4 months ago) > * ada69ac - AVRO-1736: PHP: Add LICENSE and NOTICE to distribution. (1 year, > 4 months ago) > * 96d2d35 - AVRO-1735: Perl: Add LICENSE and NOTICE to distribution. 
(1 year, > 4 months ago) > * df5c513 - AVRO-1731: C: Add LICENSE and NOTICE to binary distribution. (1 > year, 4 months ago) > * 3585c45 - AVRO-1733: C#: Add LICENSE and NOTICE to binary distribution. (1 > year, 6 months ago) > * 5cb83db - AVRO-1732: C++: Add LICENSE and NOTICE to binary distribution. (1 > year, 6 months ago) > * bc19e3a - AVRO-1729: Ruby: Add LICENSE and NOTICE to ruby gems. (1 year, 6 > months ago) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (AVRO-1684) Add date, time, and timestamp to specific object model classes
[ https://issues.apache.org/jira/browse/AVRO-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15945493#comment-15945493 ] Ryan Blue commented on AVRO-1684: - We should definitely mark 1.8.1 as incompatible. Is it possible to fix the parser for future releases? > Add date, time, and timestamp to specific object model classes > -- > > Key: AVRO-1684 > URL: https://issues.apache.org/jira/browse/AVRO-1684 > Project: Avro > Issue Type: New Feature > Components: java >Affects Versions: 1.7.7 >Reporter: Ryan Blue >Assignee: Ryan Blue > Fix For: 1.8.1 > > Attachments: AVRO-1684.1.patch > > > AVRO-1672 adds conversions for date, time, and timestamp. These should be > available to specific classes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (AVRO-1738) add java tool for outputting schema fingerprints
[ https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878844#comment-15878844 ] Ryan Blue commented on AVRO-1738: - The content in NOTICE should be in LICENSE as a third-party work, and anything in NOTICE from the original project that applies to the section that was copied should be included in the Avro NOTICE. I don't think we need to update headers since the content is already licensed to the ASF, but I think it is reasonable to keep the comment about where the code came from. > add java tool for outputting schema fingerprints > > > Key: AVRO-1738 > URL: https://issues.apache.org/jira/browse/AVRO-1738 > Project: Avro > Issue Type: New Feature > Components: java >Reporter: Sean Busbey >Assignee: Sean Busbey > Fix For: 1.9.0 > > Attachments: AVRO-1738.1.patch > > > over in AVRO-1694 I wanted to quickly check that the Java library came up > with the same md5/sha fingerprint for some schemas that the proposed Ruby > implementation does. > I noticed we don't have a tool that exposes the functionality yet, which > seems like a commonly useful thing to do. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (AVRO-1605) Remove Jackson classes from public API
[ https://issues.apache.org/jira/browse/AVRO-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725964#comment-15725964 ] Ryan Blue commented on AVRO-1605: - I think the second option is the right one. Accessors can be used as an intermediate step, but I don't think we should have a release with them. At this point, I haven't seen a compelling case where accessors are preferable to using the public API. Why should default values be passed as JsonNode when they could be Java objects? Is there something I'm missing there? > Remove Jackson classes from public API > -- > > Key: AVRO-1605 > URL: https://issues.apache.org/jira/browse/AVRO-1605 > Project: Avro > Issue Type: Sub-task > Components: java >Affects Versions: 1.7.8 >Reporter: Tom White >Assignee: Gabor Szadovszky > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1605) Remove Jackson classes from public API
[ https://issues.apache.org/jira/browse/AVRO-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722949#comment-15722949 ] Ryan Blue commented on AVRO-1605: - To keep a patch small, we can break it across multiple issues. I'd rather do that than commit changes that should be removed later. To avoid regressions, we should add tests until we are confident we can make changes safely. Otherwise, we're relying on a "don't touch it" strategy for correctness. I don't see a need for the accessors, but I could be convinced otherwise. I just haven't been convinced there's a case that justifies the use. > Remove Jackson classes from public API > -- > > Key: AVRO-1605 > URL: https://issues.apache.org/jira/browse/AVRO-1605 > Project: Avro > Issue Type: Sub-task > Components: java >Affects Versions: 1.7.8 >Reporter: Tom White >Assignee: Gabor Szadovszky > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1885) Release 1.8.2
[ https://issues.apache.org/jira/browse/AVRO-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722928#comment-15722928 ] Ryan Blue commented on AVRO-1885: - Another that would be nice to get in: AVRO-1957 > Release 1.8.2 > - > > Key: AVRO-1885 > URL: https://issues.apache.org/jira/browse/AVRO-1885 > Project: Avro > Issue Type: Task > Components: community >Affects Versions: 1.8.2 >Reporter: Sean Busbey >Assignee: Ryan Blue > Fix For: 1.8.2 > > > Please link to any issues that should be considered blockers for the 1.8.2 > release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1885) Release 1.8.2
[ https://issues.apache.org/jira/browse/AVRO-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712519#comment-15712519 ] Ryan Blue commented on AVRO-1885: - I agree, we'll get those in the next RC. > Release 1.8.2 > - > > Key: AVRO-1885 > URL: https://issues.apache.org/jira/browse/AVRO-1885 > Project: Avro > Issue Type: Task > Components: community >Affects Versions: 1.8.2 >Reporter: Sean Busbey >Assignee: Ryan Blue > Fix For: 1.8.2 > > > Please link to any issues that should be considered blockers for the 1.8.2 > release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712516#comment-15712516 ] Ryan Blue commented on AVRO-1704: - You mean erring on the side of caution and using a larger hash? I don't think collisions with a 64-bit fingerprint are likely enough to cause any trouble. And, while you don't calculate the fingerprint every time, you do send it in the message. > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement > Components: java, spec >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Fix For: 1.9.0, 1.8.3 > > Attachments: AVRO-1704-2016-05-03-Unfinished.patch, > AVRO-1704-20160410.patch, AVRO-1704.3.patch, AVRO-1704.4.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. 
The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712508#comment-15712508 ] Ryan Blue commented on AVRO-1704: - With a spec like this, we want to be careful about having too many things that must be implemented. I think there would have to be a very good reason to add additional hashes to the spec. If you're interested in using the Avro MessageEncoder and MessageDecoder, then that shouldn't be too difficult because the code is modular enough you can implement a decoder for your message format fairly easily. > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement > Components: java, spec >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Fix For: 1.9.0, 1.8.3 > > Attachments: AVRO-1704-2016-05-03-Unfinished.patch, > AVRO-1704-20160410.patch, AVRO-1704.3.patch, AVRO-1704.4.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. 
> The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
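The framing proposed in this thread (marker, fingerprint type, fingerprint, payload) can be made concrete with a small sketch. The marker bytes `0xC3 0x01` and the little-endian 64-bit fingerprint below follow the single-object encoding that was eventually specified; the class and method names are illustrative, not the actual Avro `MessageEncoder`/`MessageDecoder` API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SingleObjectFraming {
    // Two-byte marker identifying a single-object encoded Avro message.
    static final byte[] MARKER = {(byte) 0xC3, 0x01};

    // Prefix a pre-computed 64-bit schema fingerprint onto an already
    // binary-encoded datum.
    static byte[] frame(long fingerprint, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(2 + 8 + payload.length)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.put(MARKER).putLong(fingerprint).put(payload);
        return buf.array();
    }

    // Recover the fingerprint so a SchemaStore lookup can return the
    // corresponding writer's schema before decoding the payload.
    static long readFingerprint(byte[] message) {
        if (message.length < 10 || message[0] != MARKER[0] || message[1] != MARKER[1]) {
            throw new IllegalArgumentException("not a single-object encoded message");
        }
        return ByteBuffer.wrap(message, 2, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }
}
```

As the comments above note, the fingerprint is sent with every message even though it is computed once per schema, which is why keeping it to 8 bytes (rather than a full MD5/SHA-256 hash) matters.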
[jira] [Commented] (AVRO-1957) TimeConversions do not implement getRecommendedSchema()
[ https://issues.apache.org/jira/browse/AVRO-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709886#comment-15709886 ] Ryan Blue commented on AVRO-1957: - I'll review this for inclusion. Thanks for following up. > TimeConversions do not implement getRecommendedSchema() > --- > > Key: AVRO-1957 > URL: https://issues.apache.org/jira/browse/AVRO-1957 > Project: Avro > Issue Type: Bug >Affects Versions: 1.8.1 >Reporter: Sean Timm > > org.apache.avro.data.TimeConversions.TimestampConversion and other date and > time conversions do not implement getRecommendedSchema(). When trying to > dynamically generate an Avro schema from a pojo that contains a DateTime > object using ReflectData, I get an unsupported operation exception. > I think the implementation should be as simple as > {code} > @Override > public Schema getRecommendedSchema() { > return > LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG)); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1950) Better Json serialization for Avro decimal logical types?
[ https://issues.apache.org/jira/browse/AVRO-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15644670#comment-15644670 ] Ryan Blue commented on AVRO-1950: - I think String is the right encoding for Decimal in JSON. Otherwise, the JSON parser will produce a float or a double, which will be an approximation and not the exact decimal value. > Better Json serialization for Avro decimal logical types? > - > > Key: AVRO-1950 > URL: https://issues.apache.org/jira/browse/AVRO-1950 > Project: Avro > Issue Type: Improvement >Reporter: Zoltan Farkas >Priority: Minor > > Currently as I understand decimal logical types are encoded on top of bytes > and fixed avro types. This makes them a bit "unnatural" in the json > encoding... > I worked around a hack in my fork to naturally encode them into json > decimals. A good starting point to look at is in: > https://github.com/zolyfarkas/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/DecimalEncoder.java > > My approach is a bit hacky, so I would be interested in suggestions to have > this closer to something we can integrate into avro... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1856) Update ConcatTool to support appending to an input file instead of creating a new output file
[ https://issues.apache.org/jira/browse/AVRO-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1856: Fix Version/s: (was: 1.8.2) > Update ConcatTool to support appending to an input file instead of creating a > new output file > - > > Key: AVRO-1856 > URL: https://issues.apache.org/jira/browse/AVRO-1856 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.7.7, 1.8.1 >Reporter: Mike Hurley > Fix For: 1.9.0 > > Attachments: AVRO-1856-ConcatTool-AppendToInput.diff > > > It would be nice to have ConcatTool be able to append other input files into > one of the input files instead of creating a new output file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1950) Better Json serialization for Avro decimal logical types?
[ https://issues.apache.org/jira/browse/AVRO-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15642430#comment-15642430 ] Ryan Blue commented on AVRO-1950: - I've been thinking about this and I think maybe what we should do is to add an encoding for Decimal that is a more natural fit for people using JSON. Right now, you can store a decimal as bytes or fixed, but that doesn't work well as you mentioned. But if we added a String type as well, then that would be suited for JSON and wouldn't have a compatibility problem with implementations that don't have logical types. This would require adding a String encoding for decimal to the spec and toString / fromString methods to the DecimalConversion class. > Better Json serialization for Avro decimal logical types? > - > > Key: AVRO-1950 > URL: https://issues.apache.org/jira/browse/AVRO-1950 > Project: Avro > Issue Type: Improvement >Reporter: Zoltan Farkas >Priority: Minor > > Currently as I understand decimal logical types are encoded on top of bytes > and fixed avro types. This makes them a bit "unnatural" in the json > encoding... > I worked around a hack in my fork to naturally encode them into json > decimals. A good starting point to look at is in: > https://github.com/zolyfarkas/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/DecimalEncoder.java > > My approach is a bit hacky, so I would be interested in suggestions to have > this closer to something we can integrate into avro... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
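A minimal sketch of what the proposed toString/fromString pair might look like, assuming the decoder enforces the schema's scale the same way the existing bytes/fixed conversion does. `DecimalStringCodec` and both method names are hypothetical; nothing like this exists in DecimalConversion today.

```java
import java.math.BigDecimal;

public class DecimalStringCodec {
    // Proposed String encoding: write the exact value with its scale intact,
    // using plain notation so no exponent appears in the JSON.
    static String toJsonString(BigDecimal value) {
        return value.toPlainString();
    }

    // Mirror the scale check the bytes/fixed conversion performs today.
    static BigDecimal fromJsonString(String text, int schemaScale) {
        BigDecimal parsed = new BigDecimal(text);
        if (parsed.scale() != schemaScale) {
            throw new IllegalArgumentException("Cannot decode decimal with scale "
                + parsed.scale() + " as scale " + schemaScale);
        }
        return parsed;
    }

    public static void main(String[] args) {
        System.out.println(toJsonString(new BigDecimal("4.00"))); // 4.00
        System.out.println(fromJsonString("4.00", 2));            // 4.00
    }
}
```

Round-tripping through a String keeps the exact value, which a JSON number parsed into a double cannot guarantee.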
[jira] [Updated] (AVRO-1626) Missing lang/csharp/src/apache/perf/app.config
[ https://issues.apache.org/jira/browse/AVRO-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1626: Fix Version/s: (was: 1.8.2) 1.9.0 > Missing lang/csharp/src/apache/perf/app.config > -- > > Key: AVRO-1626 > URL: https://issues.apache.org/jira/browse/AVRO-1626 > Project: Avro > Issue Type: Bug > Components: csharp >Reporter: Niels Basjes > Fix For: 1.9.0 > > > This error is output during the build > {code} > Target _CopyAppConfigFile: > /usr/lib/mono/4.5/Microsoft.Common.targets: error : Cannot copy > /home/nbasjes/avro/lang/csharp/src/apache/perf/app.config to > /home/nbasjes/avro/lang/csharp/build/perf/Release/Avro.perf.exe.config, as > the source file doesn't exist. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (AVRO-1626) Missing lang/csharp/src/apache/perf/app.config
[ https://issues.apache.org/jira/browse/AVRO-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue reopened AVRO-1626: - This patch caused the 1.8.2 tests to fail in docker. I'm pushing it out of the 1.8.2 release, we can fix it for 1.8.3 or 1.9.0. > Missing lang/csharp/src/apache/perf/app.config > -- > > Key: AVRO-1626 > URL: https://issues.apache.org/jira/browse/AVRO-1626 > Project: Avro > Issue Type: Bug > Components: csharp >Reporter: Niels Basjes > Fix For: 1.9.0 > > > This error is output during the build > {code} > Target _CopyAppConfigFile: > /usr/lib/mono/4.5/Microsoft.Common.targets: error : Cannot copy > /home/nbasjes/avro/lang/csharp/src/apache/perf/app.config to > /home/nbasjes/avro/lang/csharp/build/perf/Release/Avro.perf.exe.config, as > the source file doesn't exist. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (AVRO-1951) Python tests fail because dummyserver.net no longer exists.
[ https://issues.apache.org/jira/browse/AVRO-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved AVRO-1951. - Resolution: Fixed > Python tests fail because dummyserver.net no longer exists. > --- > > Key: AVRO-1951 > URL: https://issues.apache.org/jira/browse/AVRO-1951 > Project: Avro > Issue Type: Bug > Components: python >Affects Versions: 1.8.2 >Reporter: Ryan Blue >Assignee: Ryan Blue > Fix For: 1.8.2 > > > Python's test_ipc.py uses dummyserver.net to create an IPC client and then > check that it has the right configuration. The client doesn't interact with > the endpoint, but the endpoint has to be a valid DNS name and dummyserver.net > is no longer resolving. Updating it to a real DNS name fixes the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AVRO-1951) Python tests fail because dummyserver.net no longer exists.
Ryan Blue created AVRO-1951: --- Summary: Python tests fail because dummyserver.net no longer exists. Key: AVRO-1951 URL: https://issues.apache.org/jira/browse/AVRO-1951 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.8.2 Reporter: Ryan Blue Assignee: Ryan Blue Fix For: 1.8.2 Python's test_ipc.py uses dummyserver.net to create an IPC client and then check that it has the right configuration. The client doesn't interact with the endpoint, but the endpoint has to be a valid DNS name and dummyserver.net is no longer resolving. Updating it to a real DNS name fixes the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1897) clean checkout fails when following BUILD.txt instructions for test
[ https://issues.apache.org/jira/browse/AVRO-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1897: Resolution: Fixed Status: Resolved (was: Patch Available) Merged. Thanks for fixing this, [~sacharya]! > clean checkout fails when following BUILD.txt instructions for test > --- > > Key: AVRO-1897 > URL: https://issues.apache.org/jira/browse/AVRO-1897 > Project: Avro > Issue Type: Bug > Components: build, python >Affects Versions: 1.8.1 >Reporter: Sean Busbey >Assignee: Suraj Acharya >Priority: Blocker > Fix For: 1.8.2 > > Attachments: AVRO-1897.2.patch, AVRO-1897.patch, AVRO-1897.patch.1 > > > Clean checkout of branch-1.8 run in docker mode fails {{./build.sh test}} on > the python module because it can't find a copy of the avro-tools jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1907) Add logging to logical type handling in java library
[ https://issues.apache.org/jira/browse/AVRO-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1907: Resolution: Fixed Fix Version/s: 1.8.2 Status: Resolved (was: Patch Available) Merged #130 from [~gszadovszky]. Thanks! > Add logging to logical type handling in java library > > > Key: AVRO-1907 > URL: https://issues.apache.org/jira/browse/AVRO-1907 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Sean Busbey >Assignee: Gabor Szadovszky > Fix For: 1.8.2 > > > Right now we don't have any logging while handling logical type information > in a Schema. In particular, we use {{LogicalTypes.fromSchemaIgnoreInvalid}} > which means when folks have a problem in their schema logical types just > disappear with no debugging information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1626) Missing lang/csharp/src/apache/perf/app.config
[ https://issues.apache.org/jira/browse/AVRO-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15640463#comment-15640463 ] Ryan Blue commented on AVRO-1626: - Merged the fix from Naruto. I don't see a JIRA account, so I'm leaving this unassigned. Thanks Naruto! > Missing lang/csharp/src/apache/perf/app.config > -- > > Key: AVRO-1626 > URL: https://issues.apache.org/jira/browse/AVRO-1626 > Project: Avro > Issue Type: Bug > Components: csharp >Reporter: Niels Basjes > Fix For: 1.8.2 > > > This error is output during the build > {code} > Target _CopyAppConfigFile: > /usr/lib/mono/4.5/Microsoft.Common.targets: error : Cannot copy > /home/nbasjes/avro/lang/csharp/src/apache/perf/app.config to > /home/nbasjes/avro/lang/csharp/build/perf/Release/Avro.perf.exe.config, as > the source file doesn't exist. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (AVRO-1626) Missing lang/csharp/src/apache/perf/app.config
[ https://issues.apache.org/jira/browse/AVRO-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved AVRO-1626. - Resolution: Fixed Fix Version/s: 1.8.2 > Missing lang/csharp/src/apache/perf/app.config > -- > > Key: AVRO-1626 > URL: https://issues.apache.org/jira/browse/AVRO-1626 > Project: Avro > Issue Type: Bug > Components: csharp >Reporter: Niels Basjes > Fix For: 1.8.2 > > > This error is output during the build > {code} > Target _CopyAppConfigFile: > /usr/lib/mono/4.5/Microsoft.Common.targets: error : Cannot copy > /home/nbasjes/avro/lang/csharp/src/apache/perf/app.config to > /home/nbasjes/avro/lang/csharp/build/perf/Release/Avro.perf.exe.config, as > the source file doesn't exist. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1882) ConcurrentHashMap with non-string keys fails in Java 1.8
[ https://issues.apache.org/jira/browse/AVRO-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1882: Assignee: Sachin Goyal > ConcurrentHashMap with non-string keys fails in Java 1.8 > > > Key: AVRO-1882 > URL: https://issues.apache.org/jira/browse/AVRO-1882 > Project: Avro > Issue Type: Bug >Affects Versions: 1.8.1 >Reporter: Sachin Goyal >Assignee: Sachin Goyal > Fix For: 1.8.2 > > Attachments: TestNonStringConcurrentMap.java > > > Support for ConcurrentHashMaps with non-string keys seems to be broken when > 1.8 version of Java is used because the newer ConcurrentHashMap uses the > names "key" and "val" instead of "key" and "values" for its Map.Entry class. > [HashEntry in > 1.7|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/ConcurrentHashMap.java#218] > [MapEntry in > 1.8|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/concurrent/ConcurrentHashMap.java?av=h#3468] > Hence avro-code that assumes the presence of key/value breaks. > ([ReflectData.java:L434-L443|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java#L434-L443]) > Run the attached test to see the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
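The rename described in this issue can be observed directly with reflection. The sketch below is a standalone demonstration, not Avro code: on Java 8 and later, the entry class that ConcurrentHashMap hands out declares fields `key` and `val`, which is what broke ReflectData's assumption of a `key`/`value` pair.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class EntryFieldNames {
    // Collect the declared field names of the concrete Map.Entry class
    // that a map's entrySet() iterator hands out.
    static List<String> entryFieldNames(Map<Integer, String> map) {
        Object entry = map.entrySet().iterator().next();
        List<String> names = new ArrayList<>();
        for (Field f : entry.getClass().getDeclaredFields()) {
            names.add(f.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
        map.put(1, "a");
        // On Java 8+ this lists "key" and "val" (plus a back-reference to the
        // map), not the "key"/"value" pair the reflection code assumed.
        System.out.println(entryFieldNames(map));
    }
}
```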
[jira] [Commented] (AVRO-1605) Remove Jackson classes from public API
[ https://issues.apache.org/jira/browse/AVRO-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620519#comment-15620519 ] Ryan Blue commented on AVRO-1605: - This is looking better, but I'd like to see more justification for the accessors that are still there. For example, why do you need the Field accessor? Why can't the default values be handled as objects? > Remove Jackson classes from public API > -- > > Key: AVRO-1605 > URL: https://issues.apache.org/jira/browse/AVRO-1605 > Project: Avro > Issue Type: Sub-task > Components: java >Affects Versions: 1.7.8 >Reporter: Tom White >Assignee: Gabor Szadovszky > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1945) Python float deviation
[ https://issues.apache.org/jira/browse/AVRO-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620467#comment-15620467 ] Ryan Blue commented on AVRO-1945: - Your problem has to do with how values are represented. It isn't that Java handles float values better; it does the exact same thing that Python does if you don't round when you print it out: {code:title=scala/java} scala> String.format("%2.15f", Float.valueOf(1.9)) res0: String = 1.899999976158142 {code} 1.9 simply can't be represented exactly as an [IEEE754|https://en.wikipedia.org/wiki/Single-precision_floating-point_format] float. > Python float deviation > -- > > Key: AVRO-1945 > URL: https://issues.apache.org/jira/browse/AVRO-1945 > Project: Avro > Issue Type: Bug > Components: python >Affects Versions: 1.8.1 > Environment: Python 2.7.11 with avro 1.8.1 > Python 3.5.2 with avro-python3 1.8.1 >Reporter: Stephan Müller > > Unfortunately, the python avro package seems to have problems with float > numbers. > After encoding data containing float values into an avro file and decoding it > back, values with decimals differ a tiny bit from the original value. > In the following code sequence, the number 1.9 is saved as float. After > decoding it back, the shown value is 1.899999976158142. 
> {code:none} > import avro.schema > from avro.datafile import DataFileReader, DataFileWriter > from avro.io import DatumReader, DatumWriter > schema_text = """{"namespace": "example.avro", > "type": "record", > "name": "Number", > "fields": [ >{"name": "name", "type": "string"}, >{"name": "number", "type": "float"} > ] > }""" > schema = avro.schema.parse(schema_text) > writer = DataFileWriter(open("numbers.avro", "wb"), DatumWriter(), schema) > writer.append({"name": "Float number with one decimal", "number": 1.9}) > writer.close() > reader = DataFileReader(open("numbers.avro", "rb"), DatumReader()) > for user in reader: > print(user) > reader.close() > {code} > Script output: > {code:none} > {u'name': u'Float number with one decimal', u'number': 1.89976158142} > {code} > Using avro-tools-1.8.1.jar to decode the same created avro file > (numbers.avro), the displayed floating numbers correspond to the original > values: > {code:none} > $ java -jar avro-tools-1.8.1.jar tojson numbers.avro > {"name":"Float number with one decimal","number":{"float":1.9}} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
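The representation point above is easy to reproduce without Avro at all; a minimal standard-library sketch of what storing 1.9 in an Avro "float" field does:

```python
import struct

# Round-trip 1.9 through a 4-byte IEEE754 single-precision value,
# which is what an Avro "float" field stores.
packed = struct.pack('<f', 1.9)
(value,) = struct.unpack('<f', packed)

print('%.15f' % value)  # prints 1.899999976158142, the nearest float32 to 1.9
```

The same stored value prints as 1.9 from avro-tools because Java's Float.toString rounds to the shortest decimal string that round-trips back to the same float.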
[jira] [Resolved] (AVRO-1945) Python float deviation
[ https://issues.apache.org/jira/browse/AVRO-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved AVRO-1945. - Resolution: Not A Bug > Python float deviation > -- > > Key: AVRO-1945 > URL: https://issues.apache.org/jira/browse/AVRO-1945 > Project: Avro > Issue Type: Bug > Components: python >Affects Versions: 1.8.1 > Environment: Python 2.7.11 with avro 1.8.1 > Python 3.5.2 with avro-python3 1.8.1 >Reporter: Stephan Müller > > Unfortunately, the python avro package seems to have problems with float > numbers. > After encoding data containing float values into an avro file and decoding it > back, values with decimals differ a tiny bit from the original value. > In the following code sequence, the number 1.9 is saved as float. After > decoding it back, the shown value is 1.899999976158142. > {code:none} > import avro.schema > from avro.datafile import DataFileReader, DataFileWriter > from avro.io import DatumReader, DatumWriter > schema_text = """{"namespace": "example.avro", > "type": "record", > "name": "Number", > "fields": [ >{"name": "name", "type": "string"}, >{"name": "number", "type": "float"} > ] > }""" > schema = avro.schema.parse(schema_text) > writer = DataFileWriter(open("numbers.avro", "wb"), DatumWriter(), schema) > writer.append({"name": "Float number with one decimal", "number": 1.9}) > writer.close() > reader = DataFileReader(open("numbers.avro", "rb"), DatumReader()) > for user in reader: > print(user) > reader.close() > {code} > Script output: > {code:none} > {u'name': u'Float number with one decimal', u'number': 1.899999976158142} > {code} > Using avro-tools-1.8.1.jar to decode the same created avro file > (numbers.avro), the displayed floating numbers correspond to the original > values: > {code:none} > $ java -jar avro-tools-1.8.1.jar tojson numbers.avro > {"name":"Float number with one decimal","number":{"float":1.9}} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1897) clean checkout fails when following BUILD.txt instructions for test
[ https://issues.apache.org/jira/browse/AVRO-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576043#comment-15576043 ] Ryan Blue commented on AVRO-1897: - Looks fine to me. > clean checkout fails when following BUILD.txt instructions for test > --- > > Key: AVRO-1897 > URL: https://issues.apache.org/jira/browse/AVRO-1897 > Project: Avro > Issue Type: Bug > Components: build, python >Affects Versions: 1.8.1 >Reporter: Sean Busbey >Assignee: Suraj Acharya >Priority: Blocker > Fix For: 1.8.2 > > Attachments: AVRO-1897.2.patch, AVRO-1897.patch, AVRO-1897.patch.1 > > > Clean checkout of branch-1.8 run in docker mode fails {{./build.sh test}} on > the python module because it can't find a copy of the avro-tools jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1885) Release 1.8.2
[ https://issues.apache.org/jira/browse/AVRO-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572177#comment-15572177 ] Ryan Blue commented on AVRO-1885: - [~busbey], are we close to a release candidate? I don't think any of the linked issues are actually blockers: * AVRO-1897 has a work-around and is minor * AVRO-1856 is a new feature that would be nice, but isn't a blocker * AVRO-1932 is also nice to have, but not a blocker > Release 1.8.2 > - > > Key: AVRO-1885 > URL: https://issues.apache.org/jira/browse/AVRO-1885 > Project: Avro > Issue Type: Task > Components: community >Affects Versions: 1.8.2 >Reporter: Sean Busbey >Assignee: Sean Busbey > Fix For: 1.8.2 > > > Please link to any issues that should be considered blockers for the 1.8.2 > release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1897) clean checkout fails when following BUILD.txt instructions for test
[ https://issues.apache.org/jira/browse/AVRO-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572170#comment-15572170 ] Ryan Blue commented on AVRO-1897: - Is this still a problem? And if so, does it need to block the 1.8.2 release since we have a work-around for it? > clean checkout fails when following BUILD.txt instructions for test > --- > > Key: AVRO-1897 > URL: https://issues.apache.org/jira/browse/AVRO-1897 > Project: Avro > Issue Type: Bug > Components: build, python >Affects Versions: 1.8.1 >Reporter: Sean Busbey >Assignee: Suraj Acharya >Priority: Blocker > Fix For: 1.8.2 > > Attachments: AVRO-1897.patch, AVRO-1897.patch.1 > > > Clean checkout of branch-1.8 run in docker mode fails {{./build.sh test}} on > the python module because it can't find a copy of the avro-tools jar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1605) Remove Jackson classes from public API
[ https://issues.apache.org/jira/browse/AVRO-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565892#comment-15565892 ] Ryan Blue commented on AVRO-1605: - Because I don't think accessors are a good practice, I think that we should not add them unless there is a compelling reason and for those we do add we should have a plan for removing them. The current patch adds accessors where I don't think they are necessary. Getting the default value from a field should simply return the default as an Avro object. I don't think there's a performance penalty for that change, but we have benchmarks if we need to make sure that's the case. Another example is Accessor.parseJson. It's only used once and could easily be replaced with a new method: Schema.parseDefaultValue(String, Schema). I think the next steps are to try to remove as many as possible in this patch and have a good reason for the ones that remain. Does that sound reasonable? > Remove Jackson classes from public API > -- > > Key: AVRO-1605 > URL: https://issues.apache.org/jira/browse/AVRO-1605 > Project: Avro > Issue Type: Sub-task > Components: java >Affects Versions: 1.7.8 >Reporter: Tom White >Assignee: Gabor Szadovszky > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1605) Remove Jackson classes from public API
[ https://issues.apache.org/jira/browse/AVRO-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565867#comment-15565867 ] Ryan Blue commented on AVRO-1605: - [~gszadovszky], sorry about not getting back to this before now. > Remove Jackson classes from public API > -- > > Key: AVRO-1605 > URL: https://issues.apache.org/jira/browse/AVRO-1605 > Project: Avro > Issue Type: Sub-task > Components: java >Affects Versions: 1.7.8 >Reporter: Tom White >Assignee: Gabor Szadovszky > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1605) Remove Jackson classes from public API
[ https://issues.apache.org/jira/browse/AVRO-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542684#comment-15542684 ] Ryan Blue commented on AVRO-1605: - My concern isn't that this is public; I see that these classes are in the internal package. But I don't think accessors are a good practice generally and there are definitely uses here that are unnecessary. If this is temporary, then that makes some sense. But if these accessors are going to linger in the code for a while then I don't support adding them as a short-cut. > Remove Jackson classes from public API > -- > > Key: AVRO-1605 > URL: https://issues.apache.org/jira/browse/AVRO-1605 > Project: Avro > Issue Type: Sub-task > Components: java >Affects Versions: 1.7.8 >Reporter: Tom White >Assignee: Gabor Szadovszky > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1861) Avro Schema parser treats Avro float type as Java Double for default values
[ https://issues.apache.org/jira/browse/AVRO-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542660#comment-15542660 ] Ryan Blue commented on AVRO-1861: - Since the Avro type is float, can't we just cast the double to a float? > Avro Schema parser treats Avro float type as Java Double for default values > --- > > Key: AVRO-1861 > URL: https://issues.apache.org/jira/browse/AVRO-1861 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.1 >Reporter: Andy Mok >Assignee: Gabor Szadovszky > > The following code snippet in the [Schema > class|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/Schema.java] > shows that we explicitly treat Avro {{FLOAT}} and {{DOUBLE}} as a Java > {{Double}}. > {code:java} > JsonNode defaultValue = field.get("default"); > if (defaultValue != null > && (Type.FLOAT.equals(fieldSchema.getType()) > || Type.DOUBLE.equals(fieldSchema.getType())) > && defaultValue.isTextual()) > defaultValue = > new DoubleNode(Double.valueOf(defaultValue.getTextValue())); > {code} > Jackson has support for > [FloatNode|https://fasterxml.github.io/jackson-databind/javadoc/2.3.0/com/fasterxml/jackson/databind/node/FloatNode.html] > so why don't we use that? > This is a problem when someone calls > [Schema.Field#defaultVal|https://avro.apache.org/docs/1.8.1/api/java/org/apache/avro/Schema.Field.html#defaultVal()] > for an Avro field with Avro type {{FLOAT}} and they try to typecast the > object to a Java {{float}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
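The effect of the suggested narrowing cast can be sketched outside Java as well; `narrow_to_float32` here is an illustrative helper, not part of any Avro API:

```python
import struct

def narrow_to_float32(x):
    # Round-trip through IEEE754 single precision, mirroring the
    # suggested (float) cast on the Java side for FLOAT defaults.
    return struct.unpack('<f', struct.pack('<f', x))[0]

# Exactly representable defaults survive the cast unchanged...
print(narrow_to_float32(2.5))
# ...others move to the nearest single-precision value.
print('%.15f' % narrow_to_float32(0.1))
```

Whether the cast is acceptable depends on the default: values that aren't exactly representable in single precision shift slightly, which is usually harmless for a default value.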
[jira] [Commented] (AVRO-1605) Remove Jackson classes from public API
[ https://issues.apache.org/jira/browse/AVRO-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542652#comment-15542652 ] Ryan Blue commented on AVRO-1605: - If I understand correctly, the motivation behind adding the accessors is to avoid needing to rewrite everything to stop using Jackson classes, but still move the functionality outside of the public API. Am I right? If so, then what is the plan to remove them? The `FieldAccessor` can certainly be removed fairly easily and I think the others, like `SchemaAccessor` can as well. I think it's fine to finish this in follow-up commits, but I wouldn't want a release with the accessors because we're just making the problem more complicated and moving it elsewhere. > Remove Jackson classes from public API > -- > > Key: AVRO-1605 > URL: https://issues.apache.org/jira/browse/AVRO-1605 > Project: Avro > Issue Type: Sub-task > Components: java >Affects Versions: 1.7.8 >Reporter: Tom White >Assignee: Gabor Szadovszky > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1812) IDL compiler doesn't support logical types
[ https://issues.apache.org/jira/browse/AVRO-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15537381#comment-15537381 ] Ryan Blue commented on AVRO-1812: - We do have support for logical types in the IDL, added as part of AVRO-1684. It adds [date, time_ms, and timestamp_ms|https://github.com/apache/avro/pull/86/files#diff-136e900cc327974ae416a44248e47d0aR1488]. I think there was a similar change when decimal was added. Is this what you had in mind? > IDL compiler doesn't support logical types > -- > > Key: AVRO-1812 > URL: https://issues.apache.org/jira/browse/AVRO-1812 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza >Priority: Minor > > At least, as far as I can see it doesn't support it. We're looking at > migrating to 1.8, and these additional types are a big motivator. > I would advocate adding each one of the new logical types as supported > primitive types in the IDL... if that sounds good to you, I can probably look > into doing it tomorrow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
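For reference, this is roughly what the AVRO-1684 keywords look like in use; a sketch only, since exact keyword availability depends on the Avro version:

```
protocol Example {
  record Event {
    date d;            // logical type date, backed by int
    time_ms t;         // logical type time-millis, backed by int
    timestamp_ms ts;   // logical type timestamp-millis, backed by long
  }
}
```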
[jira] [Resolved] (AVRO-1915) AvroTypeException decoding from earlier schema version
[ https://issues.apache.org/jira/browse/AVRO-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved AVRO-1915. - Resolution: Not A Problem Thanks for letting us know you got past this. I'm going to resolve this issue. > AvroTypeException decoding from earlier schema version > -- > > Key: AVRO-1915 > URL: https://issues.apache.org/jira/browse/AVRO-1915 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.7.7 >Reporter: NPE > > We have two services which communicate with one another by sending > JSON-encoded Avro-based messages over Kafka. We want to update the schema > for messages sent from service A to service B by adding an additional string > field with a default value of "" (empty string). We have tested by initially > adding the updated schema to service B (the reader) and continuing to send > messages in the older format from service A (the writer). > Simplified example of old schema (some fields omitted): > {code} > { > "type": "record", > "name": "Envelope", > "fields": [{ > "name": "appId", > "type": "string" > }, { > "name": "time", > "type": "long" > }, { > "name": "type", > "type": "string" > }, { > "name": "payload", > "type": [{ > "type": "record", > "name": "MessagePayload", > "fields": [{ > "name": "context", > "type": { > "type": "record", > "name": "PayloadContext", > "fields": [{ > "name": "source", > "type": "string" > }, { > "name": "requestId", > "type": "string" > }] > } > }, { > "name": "content", > "type": "string" > }, { > "name": "contentType", > "type": "string" > }] > }] > }] > } > {code} > Simplified example of new schema (some fields omitted): > {code} > { > "type": "record", > "name": "Envelope", > "fields": [{ > "name": "appId", > "type": "string" > }, { > "name": "time", > "type": "long" > }, { > "name": "type", > "type": "string" > }, { > "name": "payload", > "type": [{ > "type": "record", > "name": "MessagePayload", > "fields": [{ > "name": "context", > "type": { > "type": 
"record", > "name": "PayloadContext", > "fields": [{ > "name": "source", > "type": "string" > }, { > "name": "requestId", > "type": "string" > }, { > "name": "newField", > "type": "string", > "default": "" > }] > } > }, { > "name": "content", > "type": "string" > }, { > "name": "contentType", > "type": "string" > }] > }] > }] > } > {code} > Our understanding was that the reader, with the newer schema, should be able > to parse messages sent with the older given the default value for the missing > field; however, we are getting the following exception: > {code} > org.apache.avro.AvroTypeException: Expected string. Got END_OBJECT > {code} > Are we missing something here? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1554) Avro should have support for common constructs like UUID and Date
[ https://issues.apache.org/jira/browse/AVRO-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1554: Resolution: Fixed Fix Version/s: 1.8.1 Status: Resolved (was: Patch Available) I agree, this is done now that there are logical types for both date and UUID. > Avro should have support for common constructs like UUID and Date > - > > Key: AVRO-1554 > URL: https://issues.apache.org/jira/browse/AVRO-1554 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.7.6 >Reporter: Sachin Goyal > Fix For: 1.8.1 > > Attachments: AVRO-1554.patch, AVRO-1554_2.patch, AVRO-1554_3.patch, > CustomEncodingUnionBug.zip > > > Consider the following code: > {code} > public class AvroExample > { > public static void main (String [] args) throws Exception > { > ReflectData rdata = ReflectData.AllowNull.get(); > Schema schema = rdata.getSchema(Temp.class); > > ReflectDatumWriter<Temp> datumWriter = > new ReflectDatumWriter<Temp>(Temp.class, rdata); > DataFileWriter<Temp> fileWriter = > new DataFileWriter<Temp>(datumWriter); > ByteArrayOutputStream baos = new ByteArrayOutputStream(); > fileWriter.create(schema, baos); > fileWriter.append(new Temp()); > fileWriter.close(); > byte[] bytes = baos.toByteArray(); > GenericDatumReader<GenericRecord> datumReader = > new GenericDatumReader<GenericRecord>(); > SeekableByteArrayInput avroInputStream = > new SeekableByteArrayInput(bytes); > DataFileReader<GenericRecord> fileReader = > new DataFileReader<GenericRecord>(avroInputStream, > datumReader); > schema = fileReader.getSchema(); > GenericRecord record = null; > record = fileReader.next(record); > System.out.println (record); > System.out.println (record.get("id")); > } > } > class Temp > { > UUID id = UUID.randomUUID(); > Date date = new Date(); > BigInteger bi = BigInteger.TEN; > } > {code} > Output from this code is: > {code:javascript} > {"id": {}, "date": {}, "bi": "10"} > {code} > UUID and Date type fields are very common in Java and can be found a lot in > third-party code as well (where it may be difficult
to put annotations). > So Avro should include default serialization/deserialization support for > such fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1047) Generated Java classes for specific records contain unchecked casts
[ https://issues.apache.org/jira/browse/AVRO-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497026#comment-15497026 ] Ryan Blue commented on AVRO-1047: - [~nielsbasjes], can you check whether this is the same thing that you fixed in AVRO-1913? > Generated Java classes for specific records contain unchecked casts > --- > > Key: AVRO-1047 > URL: https://issues.apache.org/jira/browse/AVRO-1047 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.6.3 >Reporter: Garrett Wu > Attachments: AVRO-1047.patch, suppress-warnings.tar.gz > > > The generated Java classes for specific records cause compiler warnings using > Oracle/Sun Java 1.6, since it doesn't support @SuppressWarnings("all"). > Instead could we change it to @SuppressWarnings("unchecked")? Only > "unchecked" and "deprecation" are mentioned in the Java Language Specification -- > the rest are specific to compiler vendors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1684) Add date, time, and timestamp to specific object model classes
[ https://issues.apache.org/jira/browse/AVRO-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497024#comment-15497024 ] Ryan Blue commented on AVRO-1684: - Yeah, good idea. Thanks! > Add date, time, and timestamp to specific object model classes > -- > > Key: AVRO-1684 > URL: https://issues.apache.org/jira/browse/AVRO-1684 > Project: Avro > Issue Type: New Feature > Components: java >Affects Versions: 1.7.7 >Reporter: Ryan Blue >Assignee: Ryan Blue > Fix For: 1.8.1 > > Attachments: AVRO-1684.1.patch > > > AVRO-1672 adds conversions for date, time, and timestamp. These should be > available to specific classes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1916) Building python version uses wrong version avro-tools.
[ https://issues.apache.org/jira/browse/AVRO-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497013#comment-15497013 ] Ryan Blue commented on AVRO-1916: - Can we set this separately to use the last release of avro-tools? The functionality doesn't change much. I don't think this is a blocker, just an annoyance in the build process. > Building python version uses wrong version avro-tools. > -- > > Key: AVRO-1916 > URL: https://issues.apache.org/jira/browse/AVRO-1916 > Project: Avro > Issue Type: Bug >Reporter: Niels Basjes > > During {{./build.sh test}} I see this during the build of {{lang/py}} > {code} > [ivy:retrieve]found org.apache.avro#avro-tools;1.9.0-SNAPSHOT in > apache-snapshots > [ivy:retrieve] downloading > https://repository.apache.org/content/groups/snapshots/org/apache/avro/avro-tools/1.9.0-SNAPSHOT/avro-tools-1.9.0-20160122.173016-35.jar > ... > {code} > So apparently the py build phase uses an external version of avro-tools. > What if I just updated avro-tools? Then it is quite possible the test will > pass while in reality it should have failed. > I suspect the fix can be as simple as doing a {{mvn install}} on the java > avro-tools before building/testing the rest of the languages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1885) Release 1.8.2
[ https://issues.apache.org/jira/browse/AVRO-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491032#comment-15491032 ] Ryan Blue commented on AVRO-1885: - I reviewed the two PRs you posted. They look good to me, thanks for taking the time to fix those. > Release 1.8.2 > - > > Key: AVRO-1885 > URL: https://issues.apache.org/jira/browse/AVRO-1885 > Project: Avro > Issue Type: Task > Components: community >Affects Versions: 1.8.2 >Reporter: Sean Busbey >Assignee: Sean Busbey > Fix For: 1.8.2 > > > Please link to any issues that should be considered blockers for the 1.8.2 > release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1915) AvroTypeException decoding from earlier schema version
[ https://issues.apache.org/jira/browse/AVRO-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491022#comment-15491022 ] Ryan Blue commented on AVRO-1915: - These schemas do look compatible to me. Usually when you have trouble reading in these cases, the problem is that you haven't passed both the writer's schema and the reader's schema to the datum reader. The writer's schema is needed to know what fields to expect. When you construct your datum reader, pass both schemas, like this: {code:lang=java} DatumReader reader = GenericData.get().createDatumReader(writerSchema, readerSchema); {code} Also, the 1.8.2 release is coming up and we've added some support for this use case. In the upcoming release, you'll be able to do this: {code:lang=java} MessageEncoder<Record> v1Encoder = new BinaryMessageEncoder<>(GenericData.get(), SCHEMA_V1); BinaryMessageDecoder<Record> v2Decoder = new BinaryMessageDecoder<>(GenericData.get(), SCHEMA_V2); // add the older version to the decoder so it can handle both v1 and v2 messages (you can do this with as many as you need) v2Decoder.addSchema(SCHEMA_V1); // encode the v1 record (on the producer side) ByteBuffer v1Buffer = v1Encoder.encode(v1record); // decode the v1 record to the expected v2 schema (on the consumer side) Record v2record = v2Decoder.decode(v1Buffer); {code} > AvroTypeException decoding from earlier schema version > -- > > Key: AVRO-1915 > URL: https://issues.apache.org/jira/browse/AVRO-1915 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.7.7 >Reporter: NPE > > We have two services which communicate with one another by sending > JSON-encoded Avro-based messages over Kafka. We want to update the schema > for messages sent from service A to service B by adding an additional string > field with a default value of "" (empty string). 
We have tested by initially > adding the updated schema to service B (the reader) and continuing to send > messages in the older format from service A (the writer). > Simplified example of old schema (some fields omitted): > {code} > { > "type": "record", > "name": "Envelope", > "fields": [{ > "name": "appId", > "type": "string" > }, { > "name": "time", > "type": "long" > }, { > "name": "type", > "type": "string" > }, { > "name": "payload", > "type": [{ > "type": "record", > "name": "MessagePayload", > "fields": [{ > "name": "context", > "type": { > "type": "record", > "name": "PayloadContext", > "fields": [{ > "name": "source", > "type": "string" > }, { > "name": "requestId", > "type": "string" > }] > } > }, { > "name": "content", > "type": "string" > }, { > "name": "contentType", > "type": "string" > }] > }] > }] > } > {code} > Simplified example of new schema (some fields omitted): > {code} > { > "type": "record", > "name": "Envelope", > "fields": [{ > "name": "appId", > "type": "string" > }, { > "name": "time", > "type": "long" > }, { > "name": "type", > "type": "string" > }, { > "name": "payload", > "type": [{ > "type": "record", > "name": "MessagePayload", > "fields": [{ > "name": "context", > "type": { > "type": "record", > "name": "PayloadContext", > "fields": [{ > "name": "source", > "type": "string" > }, { > "name": "requestId", > "type": "string" > }, { >
[jira] [Updated] (AVRO-1900) build.sh test fails for dev-tools.jar
[ https://issues.apache.org/jira/browse/AVRO-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1900: Resolution: Fixed Assignee: Ryan Blue (was: Suraj Acharya) Fix Version/s: 1.8.2 Status: Resolved (was: Patch Available) Merged. Thanks for reviewing, Sean! > build.sh test fails for dev-tools.jar > - > > Key: AVRO-1900 > URL: https://issues.apache.org/jira/browse/AVRO-1900 > Project: Avro > Issue Type: Bug >Affects Versions: 1.9.0 >Reporter: Suraj Acharya >Assignee: Ryan Blue > Labels: build > Fix For: 1.8.2 > > Attachments: AVRO-1900.patch > > > When i ran {{./build.sh test}} in the docker container I was getting an error > mentioning dev-tools.jar was not present. > When I looked further into it I realized that {{build.sh}} never actually > builds dev-tools.jar. > I added a line in the test option to first {{mvn install}} on the dev-tools > folder. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1910) Avro HttpTransceiver.cs - throw new Exception(string.Format("Unexpected end of response binary stream - expected {0} more bytes in current chunk", (object)count));
[ https://issues.apache.org/jira/browse/AVRO-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1910: Assignee: Jeremy Custenborder > Avro HttpTransceiver.cs - throw new Exception(string.Format("Unexpected end > of response binary stream - expected {0} more bytes in current chunk", > (object)count)); > > > Key: AVRO-1910 > URL: https://issues.apache.org/jira/browse/AVRO-1910 > Project: Avro > Issue Type: Bug > Components: csharp >Reporter: Arthur Coquelet >Assignee: Jeremy Custenborder > > in Avro HttpTransceiver.cs - > I basically get this exception: > throw new Exception(string.Format("Unexpected end of response binary stream - > expected {0} more bytes in current chunk", (object)count)); > I see this in Excel, which uses my addin: > Unexpected end of response binary stream - expected 1154463041 more bytes in > current chunk > It appears while transferring data from the server to the client. It's lots of > data, but it does not reach the limit. > Could you please advise on how to deal with this exception? > Thanks in advance. > Best regards, > Arthur Coquelet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
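For context, Avro RPC framing sends a series of length-prefixed buffers (a four-byte big-endian length, then the payload). A hedged sketch of that framing, where `read_frame` is an illustrative helper rather than the C# transceiver's API, shows how a corrupt or misaligned length prefix produces an absurd "expected N more bytes" count:

```python
import io
import struct

def read_frame(stream, max_len=1 << 26):
    # Read one length-prefixed buffer: 4-byte big-endian length, then payload.
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError('truncated frame header')
    (length,) = struct.unpack('>I', header)
    if length > max_len:
        # A misaligned prefix (e.g. payload bytes read as a length)
        # decodes to a huge, implausible count like 1154463041.
        raise ValueError('implausible frame length: %d' % length)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError('expected %d more bytes in current chunk'
                       % (length - len(payload)))
    return payload

print(read_frame(io.BytesIO(struct.pack('>I', 3) + b'abc')))  # b'abc'
```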
[jira] [Commented] (AVRO-1843) Clarify importance of writer's schema in documentation
[ https://issues.apache.org/jira/browse/AVRO-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15484667#comment-15484667 ] Ryan Blue commented on AVRO-1843: - I'm removing this as a blocker for 1.8.2. It's about ready and I'll commit it if it's done in time, but I don't think we should hold 1.8.2 on this. > Clarify importance of writer's schema in documentation > -- > > Key: AVRO-1843 > URL: https://issues.apache.org/jira/browse/AVRO-1843 > Project: Avro > Issue Type: Improvement > Components: doc >Reporter: Shannon Carey >Priority: Critical > Fix For: 1.9.0 > > > I'll be submitting a PR with some improvements to the Java Getting Started > page as well as the Specification which make it clearer that Avro must read > all data with the writer's schema before converting it into the reader's > schema and why, and explaining that's why the schema should be available next > to serialized data. Currently, it's arguably too easy to misinterpret Avro as > only requiring a single, reader's schema in order to read data while still > following the resolution rules which make Avro seem similar to JSON > (resolution by field name). For example, the Java API examples only appear to > involve one schema, hiding the fact that it reads in the writer's schema > implicitly. Also, the ability to serialize to JSON (where field names and > some type info is present) makes this misconception easy to believe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1843) Clarify importance of writer's schema in documentation
[ https://issues.apache.org/jira/browse/AVRO-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1843: Fix Version/s: (was: 1.8.2) (was: 1.7.8) > Clarify importance of writer's schema in documentation > -- > > Key: AVRO-1843 > URL: https://issues.apache.org/jira/browse/AVRO-1843 > Project: Avro > Issue Type: Improvement > Components: doc >Reporter: Shannon Carey >Priority: Critical > Fix For: 1.9.0 > > > I'll be submitting a PR with some improvements to the Java Getting Started > page as well as the Specification which make it clearer that Avro must read > all data with the writer's schema before converting it into the reader's > schema and why, and explaining that's why the schema should be available next > to serialized data. Currently, it's arguably too easy to misinterpret Avro as > only requiring a single, reader's schema in order to read data while still > following the resolution rules which make Avro seem similar to JSON > (resolution by field name). For example, the Java API examples only appear to > involve one schema, hiding the fact that it reads in the writer's schema > implicitly. Also, the ability to serialize to JSON (where field names and > some type info is present) makes this misconception easy to believe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1873) avro gem doesn't compatible with other languages with snappy compression
[ https://issues.apache.org/jira/browse/AVRO-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1873: Resolution: Fixed Status: Resolved (was: Patch Available) > avro gem doesn't compatible with other languages with snappy compression > > > Key: AVRO-1873 > URL: https://issues.apache.org/jira/browse/AVRO-1873 > Project: Avro > Issue Type: Bug > Components: ruby >Affects Versions: 1.8.1 > Environment: CentOS 6.8 64bit, Snappy 1.1.0, Python 3.5, Ruby 2.2.3 >Reporter: Pumsuk Cho >Assignee: Ryan Blue >Priority: Blocker > Fix For: 1.8.2 > > > I've tested avro gem today, then found some weird result. > With python library like "fastavro", generated an avro file snappy > compressed. This file works fine with avro-tools-1.8.1.jar. > java -jar avro-tools-1.8.1.jar tojson testing.avro returns what I expected. > But NOT compatible with ruby using avro gem, which returns an "Invalid Input" message. > And a snappy compressed avro file made with avro gem doesn't work with > avro-tools nor in python with avro-python3 and fastavro. > my ruby codes are below: > schema = Avro::Schema.parse(File.open('test.avsc', 'r').read) > avrofile = File.open('test.avro', 'wb') > writer = Avro::IO::DatumWriter.new(schema) > datawriter = Avro::DataFile::Writer.new avrofile, writer, schema, 'snappy' > datawriter << {"title" => "Avro", "author" => "Apache Foundation"} > datawriter.close -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1912) C++ Resolving Decoding doesn't work if element removed from record in array.
[ https://issues.apache.org/jira/browse/AVRO-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15484426#comment-15484426 ] Ryan Blue commented on AVRO-1912: - [~thiru_mg], we're going to be making a 1.8.2 RC soon, so please merge this if you think it's ready. Thanks! > C++ Resolving Decoding doesn't work if element removed from record in array. > > > Key: AVRO-1912 > URL: https://issues.apache.org/jira/browse/AVRO-1912 > Project: Avro > Issue Type: Bug >Reporter: John McClean > Attachments: AVRO-1912.patch > > > Writer schema: > {code} > { > "type": "record", > "name": "TestRecord", > "fields": [ > { > "name": "array", > "type": { > "type": "array", > "items": { > "name": "item", > "type": "record", > "fields": [ > { "name": "A", "type": "string" }, > { "name": "B", "type": "string", "default": "foo" } > ] > } > } > } > ] > } > {code} > Reader schema: > {code} > { > "type": "record", > "name": "TestRecord", > "fields": [ > { > "name": "array", > "type": { > "type": "array", > "items": { > "name": "item", > "type": "record", > "fields": [ > { "name": "A", "type": "string" } > ] > } > } > } > ] > } > {code} > Data is: > {code} > { > "array": [ > { > "A": "", > "B": "" > } > ] > } > {code} > The following code fails with an exception “Expected: Repeater got String”. > The equivalent java code works fine on the same schema and data. > {code} > auto decoder = avro::resolvingDecoder(writerSchema, > readerSchema, > avro::jsonDecoder(writerSchema)); > std::stringstream ss = loadData(); > std::auto_ptr<avro::InputStream> in = avro::istreamInputStream(ss); > decoder->init(*in); > auto record = reader::TestRecord(); > decode(*decoder, record); > {code} > I stepped through the code and what seems to be happening is that the code is > treating “A” and “B” as distinct elements in the array, as if the array had > two elements rather than one. > I'm not sure how to go about fixing this. Any pointers would be appreciated. > (I don't think it's my C++ test code. 
It works fine if the record above isn't > in an array.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
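The per-item resolution the reporter expected can be sketched as follows. This is an illustrative Java model of the resolution rule, not Avro's actual C++ decoder API: every array item must be decoded with the writer's full field list, and fields the reader dropped ("B") are consumed and discarded per item rather than being mistaken for extra array elements, which is the failure described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch of resolved decoding for the schemas above.
// The token-stream representation is a stand-in for the real binary/JSON decoder.
public class ResolvingSketch {
    static List<String> decodeArray(Deque<String> tokens,
                                    List<String> writerFields,
                                    List<String> readerFields) {
        int count = Integer.parseInt(tokens.pop()); // array element count
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            for (String field : writerFields) {     // consume every *written* field
                String value = tokens.pop();
                if (readerFields.contains(field)) { // keep only reader-schema fields
                    kept.add(value);
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // One record with fields A and B; reader keeps only A.
        Deque<String> tokens = new ArrayDeque<>(List.of("1", "a-value", "b-value"));
        System.out.println(decodeArray(tokens, List.of("A", "B"), List.of("A")));
        // one element survives, not two
    }
}
```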
[jira] [Commented] (AVRO-1885) Release 1.8.2
[ https://issues.apache.org/jira/browse/AVRO-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15482386#comment-15482386 ] Ryan Blue commented on AVRO-1885: - I don't think you mean AVRO-1887. Is there a different one? > Release 1.8.2 > - > > Key: AVRO-1885 > URL: https://issues.apache.org/jira/browse/AVRO-1885 > Project: Avro > Issue Type: Task > Components: community >Affects Versions: 1.8.2 >Reporter: Sean Busbey >Assignee: Sean Busbey > Fix For: 1.8.2 > > > Please link to any issues that should be considered blockers for the 1.8.2 > release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1885) Release 1.8.2
[ https://issues.apache.org/jira/browse/AVRO-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15480739#comment-15480739 ] Ryan Blue commented on AVRO-1885: - I think the remaining 3 issues (excluding AVRO-1811 and AVRO-1843) are ready to be committed and just need final reviews. Then we should be ready for an RC since AVRO-1900 fixes the build blocker. > Release 1.8.2 > - > > Key: AVRO-1885 > URL: https://issues.apache.org/jira/browse/AVRO-1885 > Project: Avro > Issue Type: Task > Components: community >Affects Versions: 1.8.2 >Reporter: Sean Busbey >Assignee: Sean Busbey > Fix For: 1.8.2 > > > Please link to any issues that should be considered blockers for the 1.8.2 > release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1885) Release 1.8.2
[ https://issues.apache.org/jira/browse/AVRO-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15480736#comment-15480736 ] Ryan Blue commented on AVRO-1885: - In the interest of getting the RC out, I propose we remove AVRO-1811 and AVRO-1843 from the list of blockers. AVRO-1811 has a patch, but it isn't complete yet and this has been a bug for a long time without being caught so I think the priority is pretty low. AVRO-1843 is also low priority and is close but doesn't need to hold up the release. If they make it in time, that's great but I don't think they need to be blockers. Anyone else have an opinion? > Release 1.8.2 > - > > Key: AVRO-1885 > URL: https://issues.apache.org/jira/browse/AVRO-1885 > Project: Avro > Issue Type: Task > Components: community >Affects Versions: 1.8.2 >Reporter: Sean Busbey >Assignee: Sean Busbey > Fix For: 1.8.2 > > > Please link to any issues that should be considered blockers for the 1.8.2 > release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1900) build.sh test fails for dev-tools.jar
[ https://issues.apache.org/jira/browse/AVRO-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15480653#comment-15480653 ] Ryan Blue commented on AVRO-1900: - I posted a fix that moves checkstyle into the lang/java build: https://github.com/apache/avro/pull/122 > build.sh test fails for dev-tools.jar > - > > Key: AVRO-1900 > URL: https://issues.apache.org/jira/browse/AVRO-1900 > Project: Avro > Issue Type: Bug >Affects Versions: 1.9.0 >Reporter: Suraj Acharya >Assignee: Suraj Acharya > Labels: build > Attachments: AVRO-1900.patch > > > When i ran {{./build.sh test}} in the docker container I was getting an error > mentioning dev-tools.jar was not present. > When I looked further into it I realized that {{build.sh}} never actually > builds dev-tools.jar. > I added a line in the test option to first {{mvn install}} on the dev-tools > folder. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1900) build.sh test fails for dev-tools.jar
[ https://issues.apache.org/jira/browse/AVRO-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15480642#comment-15480642 ] Ryan Blue commented on AVRO-1900: - I think the right way to fix this is to get rid of dev-tools. I flagged this on AVRO-1838, but didn't know that it would mess up the build script, or I wouldn't have committed it at the time. That issue was primarily about improving our checkstyle config anyway. Moving this back into lang/java/ will fix the problem so we don't need to document it for new Java devs while fixing the build. > build.sh test fails for dev-tools.jar > - > > Key: AVRO-1900 > URL: https://issues.apache.org/jira/browse/AVRO-1900 > Project: Avro > Issue Type: Bug >Affects Versions: 1.9.0 >Reporter: Suraj Acharya >Assignee: Suraj Acharya > Labels: build > Attachments: AVRO-1900.patch > > > When I ran {{./build.sh test}} in the docker container I was getting an error > mentioning dev-tools.jar was not present. > When I looked further into it I realized that {{build.sh}} never actually > builds dev-tools.jar. > I added a line in the test option to first {{mvn install}} on the dev-tools > folder. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1873) avro gem doesn't compatible with other languages with snappy compression
[ https://issues.apache.org/jira/browse/AVRO-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15480604#comment-15480604 ] Ryan Blue commented on AVRO-1873: - I wrote the same content from Java and from Ruby and hexdumped the result. The problem was that the last 4 bytes were missing from the ruby payload, but the rest of the Snappy-encoded data looked identical. From looking at [Java's SnappyCodec|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/file/SnappyCodec.java], it looks like those last 4 bytes are a CRC32 checksum. Adding the checksum (using Zlib.crc32) fixed compatibility and made it so Avro blocks written by Java and Ruby are identical. For the read path, I implemented the check but the code doesn't throw an error if the checksum doesn't match. Instead, it assumes that it is reading an older Ruby file and decompresses the entire incoming buffer and passes the result along. I don't think there's a way to both validate the checksum and detect old files, so this seems reasonable to me. > avro gem doesn't compatible with other languages with snappy compression > > > Key: AVRO-1873 > URL: https://issues.apache.org/jira/browse/AVRO-1873 > Project: Avro > Issue Type: Bug > Components: ruby >Affects Versions: 1.8.1 > Environment: CentOS 6.8 64bit, Snappy 1.1.0, Python 3.5, Ruby 2.2.3 >Reporter: Pumsuk Cho >Priority: Blocker > Fix For: 1.8.2 > > > I've tested avro gem today, then found some weird result. > With python library like "fastavro", generated an avro file snappy > compressed. This file works fine with avro-tools-1.8.1.jar. > java -jar avro-tools-1.8.1.jar tojson testing.avro returns what I expected. > But NOT compatible with ruby using avro gem returns "Invalid Input" message. > And snappy compressed avro file made with avro gem doesn't work with > avro-tools nor in python with avro-python3 and fastavro. 
> my ruby codes are below: > schema = Avro::Schema.parse(File.open('test.avsc', 'r').read) > avrofile = File.open('test.avro', 'wb') > writer = Avro::IO::DatumWriter.new(schema) > datawriter = Avro::DataFile::Writer.new avrofile, writer, schema, 'snappy' > datawriter<< {"title" => "Avro", "author" => "Apache Foundation"} > datawriter.close -- This message was sent by Atlassian JIRA (v6.3.4#6332)
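The block framing described in the comment above can be sketched as follows, assuming (per the linked Java SnappyCodec) that the 4-byte suffix is a big-endian CRC-32 of the uncompressed block. The identity `compress` method is a stand-in so the sketch runs without the Snappy library; the framing logic, not the compression, is the point.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Sketch of the codec framing: compressed bytes followed by a 4-byte CRC-32
// of the uncompressed data -- the suffix the old Ruby payloads were missing.
public class SnappyFrameSketch {
    // Stand-in for Snappy.compress; identity keeps the sketch self-contained.
    static byte[] compress(byte[] raw) {
        return raw;
    }

    static byte[] frame(byte[] uncompressed) {
        byte[] compressed = compress(uncompressed);
        CRC32 crc = new CRC32();
        crc.update(uncompressed, 0, uncompressed.length);
        ByteBuffer out = ByteBuffer.allocate(compressed.length + 4);
        out.put(compressed);
        out.putInt((int) crc.getValue()); // big-endian by ByteBuffer default
        return out.array();
    }

    public static void main(String[] args) {
        byte[] framed = frame("avro".getBytes());
        System.out.println(framed.length); // payload plus 4 checksum bytes
    }
}
```

The Ruby fix does the equivalent with Zlib.crc32; on the read path an implementation can fall back to treating the whole buffer as compressed data when the trailing 4 bytes don't verify, which is how old Ruby-written files stay readable.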
[jira] [Updated] (AVRO-1873) avro gem doesn't compatible with other languages with snappy compression
[ https://issues.apache.org/jira/browse/AVRO-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1873: Assignee: Ryan Blue Status: Patch Available (was: Open) > avro gem doesn't compatible with other languages with snappy compression > > > Key: AVRO-1873 > URL: https://issues.apache.org/jira/browse/AVRO-1873 > Project: Avro > Issue Type: Bug > Components: ruby >Affects Versions: 1.8.1 > Environment: CentOS 6.8 64bit, Snappy 1.1.0, Python 3.5, Ruby 2.2.3 >Reporter: Pumsuk Cho >Assignee: Ryan Blue >Priority: Blocker > Fix For: 1.8.2 > > > I've tested avro gem today, then found some weird result. > With python library like "fastavro", generated an avro file snappy > compressed. This file works fine with avro-tools-1.8.1.jar. > java -jar avro-tools-1.8.1.jar tojson testing.avro returns what I expected. > But NOT compatible with ruby using avro gem returns "Invalid Input" message. > And snappy compressed avro file made with avro gem doesn't work with > avro-tools nor in python with avro-python3 and fastavro. > my ruby codes are below: > schema = Avro::Schema.parse(File.open('test.avsc', 'r').read) > avrofile = File.open('test.avro', 'wb') > writer = Avro::IO::DatumWriter.new(schema) > datawriter = Avro::DataFile::Writer.new avrofile, writer, schema, 'snappy' > datawriter<< {"title" => "Avro", "author" => "Apache Foundation"} > datawriter.close -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (AVRO-1908) IPC TestSpecificCompiler build is broken
[ https://issues.apache.org/jira/browse/AVRO-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved AVRO-1908. - Resolution: Fixed Fix Version/s: 1.8.2 1.9.0 > IPC TestSpecificCompiler build is broken > > > Key: AVRO-1908 > URL: https://issues.apache.org/jira/browse/AVRO-1908 > Project: Avro > Issue Type: Bug > Components: java >Reporter: Ryan Blue >Assignee: Ryan Blue > Fix For: 1.9.0, 1.8.2 > > > AVRO-1884 changed {{SpecificCompiler.makePath}} to {{private > SpecificCompiler#makePath}}, which broke building the tests in the IPC > package. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (AVRO-1908) IPC TestSpecificCompiler build is broken
[ https://issues.apache.org/jira/browse/AVRO-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue reassigned AVRO-1908: --- Assignee: Ryan Blue > IPC TestSpecificCompiler build is broken > > > Key: AVRO-1908 > URL: https://issues.apache.org/jira/browse/AVRO-1908 > Project: Avro > Issue Type: Bug > Components: java >Reporter: Ryan Blue >Assignee: Ryan Blue > Fix For: 1.9.0, 1.8.2 > > > AVRO-1884 changed {{SpecificCompiler.makePath}} to {{private > SpecificCompiler#makePath}}, which broke building the tests in the IPC > package. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1910) Avro HttpTransceiver.cs - throw new Exception(string.Format("Unexpected end of response binary stream - expected {0} more bytes in current chunk", (object)count));
[ https://issues.apache.org/jira/browse/AVRO-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474189#comment-15474189 ] Ryan Blue commented on AVRO-1910: - [~jcustenborder], can you have a look? > Avro HttpTransceiver.cs - throw new Exception(string.Format("Unexpected end > of response binary stream - expected {0} more bytes in current chunk", > (object)count)); > > > Key: AVRO-1910 > URL: https://issues.apache.org/jira/browse/AVRO-1910 > Project: Avro > Issue Type: Bug > Components: csharp >Reporter: Arthur Coquelet > > in Avro HttpTransceiver.cs - > I basically get this exception: > throw new Exception(string.Format("Unexpected end of response binary stream - > expected {0} more bytes in current chunk", (object)count)); > I get this in my Excel, which uses my addin: > Unexpected end of response binary stream - expected 1154463041 more bytes in > current chunk > It appears while transferring data from the server to the client. It's a lot of > data, but it does not reach the limit. > Could you please advise of how to deal with this exception? > Thanks in advance. > Best regards, > Arthur Coquelet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1891) Generated Java code fails with union containing logical type
[ https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467900#comment-15467900 ] Ryan Blue commented on AVRO-1891: - Much the same way, by keeping track of the conversions for the record currently being written or delegating to the data model's types for generic and reflect. When I get time, I'll finish the write side and add tests (though if anyone else feels like continuing to get it out quicker, I'm fine with that). > Generated Java code fails with union containing logical type > > > Key: AVRO-1891 > URL: https://issues.apache.org/jira/browse/AVRO-1891 > Project: Avro > Issue Type: Bug > Components: java, logical types >Affects Versions: 1.8.1 >Reporter: Ross Black >Priority: Blocker > Fix For: 1.8.3 > > Attachments: AVRO-1891.patch, AVRO-1891.yshi.1.patch, > AVRO-1891.yshi.2.patch, AVRO-1891.yshi.3.patch, AVRO-1891.yshi.4.patch > > > Example schema: > {code} > { > "type": "record", > "name": "RecordV1", > "namespace": "org.brasslock.event", > "fields": [ > { "name": "first", "type": ["null", {"type": "long", > "logicalType":"timestamp-millis"}]} > ] > } > {code} > The avro compiler generates a field using the relevant joda class: > {code} > public org.joda.time.DateTime first > {code} > Running the following code to perform encoding: > {code} > final RecordV1 record = new > RecordV1(DateTime.parse("2016-07-29T10:15:30.00Z")); > final DatumWriter datumWriter = new > SpecificDatumWriter<>(record.getSchema()); > final ByteArrayOutputStream stream = new ByteArrayOutputStream(8192); > final BinaryEncoder encoder = > EncoderFactory.get().directBinaryEncoder(stream, null); > datumWriter.write(record, encoder); > encoder.flush(); > final byte[] bytes = stream.toByteArray(); > {code} > fails with the exception stacktrace: > {code} > org.apache.avro.AvroRuntimeException: Unknown datum type > org.joda.time.DateTime: 2016-07-29T10:15:30.000Z > at 
org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:741) > at > org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:293) > at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:706) > at > org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192) > at > org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110) > at > org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87) > at > org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143) > at > org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60) > at > org.brasslock.avro.compiler.GeneratedRecordTest.shouldEncodeLogicalTypeInUnion(GeneratedRecordTest.java:82) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42) > at >
[jira] [Commented] (AVRO-1891) Generated Java code fails with union containing logical type
[ https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15465679#comment-15465679 ] Ryan Blue commented on AVRO-1891: - I've posted a PR with the proposed changes. It still needs tests to validate it works for maps, arrays, and unions, but it demonstrates the idea. > Generated Java code fails with union containing logical type > > > Key: AVRO-1891 > URL: https://issues.apache.org/jira/browse/AVRO-1891 > Project: Avro > Issue Type: Bug > Components: java, logical types >Affects Versions: 1.8.1 >Reporter: Ross Black >Priority: Blocker > Fix For: 1.8.3 > > Attachments: AVRO-1891.patch, AVRO-1891.yshi.1.patch, > AVRO-1891.yshi.2.patch, AVRO-1891.yshi.3.patch, AVRO-1891.yshi.4.patch > > > Example schema: > {code} > { > "type": "record", > "name": "RecordV1", > "namespace": "org.brasslock.event", > "fields": [ > { "name": "first", "type": ["null", {"type": "long", > "logicalType":"timestamp-millis"}]} > ] > } > {code} > The avro compiler generates a field using the relevant joda class: > {code} > public org.joda.time.DateTime first > {code} > Running the following code to perform encoding: > {code} > final RecordV1 record = new > RecordV1(DateTime.parse("2016-07-29T10:15:30.00Z")); > final DatumWriter datumWriter = new > SpecificDatumWriter<>(record.getSchema()); > final ByteArrayOutputStream stream = new ByteArrayOutputStream(8192); > final BinaryEncoder encoder = > EncoderFactory.get().directBinaryEncoder(stream, null); > datumWriter.write(record, encoder); > encoder.flush(); > final byte[] bytes = stream.toByteArray(); > {code} > fails with the exception stacktrace: > {code} > org.apache.avro.AvroRuntimeException: Unknown datum type > org.joda.time.DateTime: 2016-07-29T10:15:30.000Z > at org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:741) > at > org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:293) > at 
org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:706) > at > org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192) > at > org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110) > at > org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87) > at > org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143) > at > org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60) > at > org.brasslock.avro.compiler.GeneratedRecordTest.shouldEncodeLogicalTypeInUnion(GeneratedRecordTest.java:82) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42) > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:253) > at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:84) > at
[jira] [Comment Edited] (AVRO-1891) Generated Java code fails with union containing logical type
[ https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15463687#comment-15463687 ] Ryan Blue edited comment on AVRO-1891 at 9/5/16 4:42 PM: - I think all this requires is keeping a set of conversions that should be applied when reading or writing a specific class. Unlike generic where the conversions are determined by the data model at runtime, the conversions that should be used for a specific class are determined at compile time. We have the benefit of knowing that the compiler either added conversions for all instances of a logical type, or for none of them. So we only need to know the set of conversions the compiler had set up when a class was compiled. Rather than relying on the set of conversions the SpecificData instance has configured, I think we should keep the set of conversions for the class being written or read. So we don't need to change how SpecificData looks up conversions, just the way the SpecificDatumReader/Writer does to avoid looking them up in the data model. (I agree with Doug and don't see an advantage of adding a conversion resolver.) What about this: * Maintain a thread-local reference to the current specific record class in SpecificDatumReader * Add a static conversion map to each specific class with its conversions (generated code) * Add conversion lookup methods to GenericDatumReader that delegate to GenericData * Override the conversion lookup methods in SpecificDatumReader that use the current record class's set of conversions instead. This way, there are no changes to how the data model lookups work, little generated code (just an annotation to conversion map), and few changes to the datum reader and writers. What do you guys think? I think this would be a bit smaller patch. I'll try to put it together tomorrow if I have time. was (Author: rdblue): I think all this requires is keeping a set of conversions that should be applied when reading or writing a specific class. 
Unlike generic where the conversions are determined by the data model at runtime, the conversions that should be used for a specific class are determined at compile time. We have the benefit of knowing that the compiler either added conversions for all instances of a logical type, or for none of them. So we only need to know the set of conversions the compiler had set up when a class was compiled. Rather than relying on the set of conversions the SpecificData instance has configured, I think we should keep the set of conversions for the class being written or read. So we don't need to change how SpecificData looks up conversions, just the way the SpecificDatumReader/Writer does to avoid looking them up in the data model. (I agree with Doug and don't see an advantage of adding a conversion resolver.) What about this: * Maintain a thread-local reference to the current specific record class in GenericDatumReader * Add a static conversion map to each specific class with its conversions (generated code) * Add conversion lookup methods to GenericDatumReader that delegate to GenericData * Override the conversion lookup methods in SpecificDatumReader that use the current record class's set of conversions instead. This way, there are no changes to how the data model lookups work, little generated code (just an annotation to conversion map), and few changes to the datum reader and writers. What do you guys think? I think this would be a bit smaller patch. I'll try to put it together tomorrow if I have time. 
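The lookup scheme proposed in the comment above can be sketched in plain Java as follows. `Conversion`, the static `CONVERSIONS` map, and `convertField` are hypothetical stand-ins for what the compiler would generate and how the datum writer would use it; they are not Avro's real API.

```java
import java.util.Map;

// Sketch: the writer tracks the specific record class currently being
// processed in a ThreadLocal, and resolves conversions from a static
// per-class map emitted at compile time, not from the SpecificData model.
public class ConversionLookupSketch {
    interface Conversion { Object toBinary(Object datum); }

    // What the compiler would emit into a generated class: the conversions
    // configured when it was compiled, keyed here by field name for brevity.
    static final Map<String, Conversion> CONVERSIONS =
        Map.of("first", datum -> ((java.time.Instant) datum).toEpochMilli());

    // Thread-local reference to the specific class currently being written.
    static final ThreadLocal<Class<?>> CURRENT = new ThreadLocal<>();

    static Object convertField(Class<?> recordClass, String field, Object datum) {
        CURRENT.set(recordClass);
        try {
            // Per-class lookup replaces the data-model lookup that fails
            // for logical types inside unions.
            Conversion c = CONVERSIONS.get(field);
            return c != null ? c.toBinary(datum) : datum;
        } finally {
            CURRENT.remove();
        }
    }
}
```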
> Generated Java code fails with union containing logical type > > > Key: AVRO-1891 > URL: https://issues.apache.org/jira/browse/AVRO-1891 > Project: Avro > Issue Type: Bug > Components: java, logical types >Affects Versions: 1.8.1 >Reporter: Ross Black >Priority: Blocker > Fix For: 1.8.3 > > Attachments: AVRO-1891.patch, AVRO-1891.yshi.1.patch, > AVRO-1891.yshi.2.patch, AVRO-1891.yshi.3.patch, AVRO-1891.yshi.4.patch > > > Example schema: > {code} > { > "type": "record", > "name": "RecordV1", > "namespace": "org.brasslock.event", > "fields": [ > { "name": "first", "type": ["null", {"type": "long", > "logicalType":"timestamp-millis"}]} > ] > } > {code} > The avro compiler generates a field using the relevant joda class: > {code} > public org.joda.time.DateTime first > {code} > Running the following code to perform encoding: > {code} > final RecordV1 record = new > RecordV1(DateTime.parse("2016-07-29T10:15:30.00Z")); > final DatumWriter datumWriter = new > SpecificDatumWriter<>(record.getSchema()); > final ByteArrayOutputStream stream = new ByteArrayOutputStream(8192); > final BinaryEncoder
[jira] [Commented] (AVRO-1891) Generated Java code fails with union containing logical type
[ https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15463687#comment-15463687 ] Ryan Blue commented on AVRO-1891: - I think all this requires is keeping a set of conversions that should be applied when reading or writing a specific class. Unlike generic where the conversions are determined by the data model at runtime, the conversions that should be used for a specific class are determined at compile time. We have the benefit of knowing that the compiler either added conversions for all instances of a logical type, or for none of them. So we only need to know the set of conversions the compiler had set up when a class was compiled. Rather than relying on the set of conversions the SpecificData instance has configured, I think we should keep the set of conversions for the class being written or read. So we don't need to change how SpecificData looks up conversions, just the way the SpecificDatumReader/Writer does to avoid looking them up in the data model. (I agree with Doug and don't see an advantage of adding a conversion resolver.) What about this: * Maintain a thread-local reference to the current specific record class in GenericDatumReader * Add a static conversion map to each specific class with its conversions (generated code) * Add conversion lookup methods to GenericDatumReader that delegate to GenericData * Override the conversion lookup methods in SpecificDatumReader that use the current record class's set of conversions instead. This way, there are no changes to how the data model lookups work, little generated code (just an annotation to conversion map), and few changes to the datum reader and writers. What do you guys think? I think this would be a bit smaller patch. I'll try to put it together tomorrow if I have time. 
> Generated Java code fails with union containing logical type > > > Key: AVRO-1891 > URL: https://issues.apache.org/jira/browse/AVRO-1891 > Project: Avro > Issue Type: Bug > Components: java, logical types >Affects Versions: 1.8.1 >Reporter: Ross Black >Priority: Blocker > Fix For: 1.8.3 > > Attachments: AVRO-1891.patch, AVRO-1891.yshi.1.patch, > AVRO-1891.yshi.2.patch, AVRO-1891.yshi.3.patch, AVRO-1891.yshi.4.patch > > > Example schema: > {code} > { > "type": "record", > "name": "RecordV1", > "namespace": "org.brasslock.event", > "fields": [ > { "name": "first", "type": ["null", {"type": "long", > "logicalType":"timestamp-millis"}]} > ] > } > {code} > The avro compiler generates a field using the relevant joda class: > {code} > public org.joda.time.DateTime first > {code} > Running the following code to perform encoding: > {code} > final RecordV1 record = new > RecordV1(DateTime.parse("2016-07-29T10:15:30.00Z")); > final DatumWriter datumWriter = new > SpecificDatumWriter<>(record.getSchema()); > final ByteArrayOutputStream stream = new ByteArrayOutputStream(8192); > final BinaryEncoder encoder = > EncoderFactory.get().directBinaryEncoder(stream, null); > datumWriter.write(record, encoder); > encoder.flush(); > final byte[] bytes = stream.toByteArray(); > {code} > fails with the exception stacktrace: > {code} > org.apache.avro.AvroRuntimeException: Unknown datum type > org.joda.time.DateTime: 2016-07-29T10:15:30.000Z > at org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:741) > at > org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:293) > at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:706) > at > org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192) > at > org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110) > at > org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87) > at > 
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143) > at > org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60) > at > org.brasslock.avro.compiler.GeneratedRecordTest.shouldEncodeLogicalTypeInUnion(GeneratedRecordTest.java:82) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at >
[jira] [Updated] (AVRO-1883) Schema validator cannot find broken backwards compatibility in Union type elements
[ https://issues.apache.org/jira/browse/AVRO-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1883: Resolution: Fixed Fix Version/s: 1.8.2 1.9.0 Status: Resolved (was: Patch Available) I committed this patch. Thanks for fixing this, [~Yibing]! (I tested this with AVRO-1908 also applied because it hasn't been merged yet.) > Schema validator cannot find broken backwards compatibility in Union type > elements > -- > > Key: AVRO-1883 > URL: https://issues.apache.org/jira/browse/AVRO-1883 > Project: Avro > Issue Type: Bug >Affects Versions: 1.8.1 >Reporter: Yibing Shi >Assignee: Yibing Shi >Priority: Critical > Fix For: 1.9.0, 1.8.2 > > Attachments: AVRO-1883.1.patch > > > Consider the 2 schemas below: > *Schema 1*: > {noformat} > [ > { > "type": "record", > "name": "rec1", > "fields": [ > { > "name": "age", > "type": "long" > } > ] > }, > { > "type": "record", > "name": "rec2", > "fields": [ > { > "name": "username", > "type": "string" > } > ] > } > ] > {noformat} > *Schema 2*: > {noformat} > [ > { > "type": "record", > "name": "rec1", > "fields": [ > { > "name": "age", > "type": "long" > }, > { > "name": "address", > "type": "string" > } > ] > }, > { > "type": "record", > "name": "rec2", > "fields": [ > { > "name": "username", > "type": "string" > } > ] > } > ] > {noformat} > The {{rec1}} records in these 2 unions are not compatible, because the > {{address}} field added to {{rec1}} in the second one is not nullable and has > no default value. However, if we check them with a validator as below, it > doesn't return any error: > {code} > final SchemaValidator backwardValidator = new > SchemaValidatorBuilder().canReadStrategy().validateLatest(); > final Schema schema1 = new Schema.Parser().parse(schema1Str); > final Schema schema2 = new Schema.Parser().parse(schema2Str); > backwardValidator.validate(schema2, Arrays.asList(schema1)); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
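The incompatibility the validator missed boils down to: a reader schema that adds a field without a default cannot read data written before the field existed. A toy check of just that rule (a deliberate simplification of Avro's schema-resolution algorithm, not its actual API; names are made up):

```java
import java.util.Map;
import java.util.Set;

// Toy backward-compatibility check for the AVRO-1883 case: a new record
// schema can read old data only if every field it adds has a default value.
// This is a simplification of Avro's resolution rules, not Avro's API.
public class ToyCompatCheck {
    // writerFields: field names present in the old (writer) schema.
    // readerFieldHasDefault: reader field name -> whether it declares a default.
    static boolean canRead(Set<String> writerFields,
                           Map<String, Boolean> readerFieldHasDefault) {
        for (Map.Entry<String, Boolean> field : readerFieldHasDefault.entrySet()) {
            boolean presentInWriter = writerFields.contains(field.getKey());
            if (!presentInWriter && !field.getValue()) {
                return false; // added field with no default: old data unreadable
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> oldRec1 = Set.of("age");
        // New rec1 adds a required (no-default) "address" field, as reported.
        assert !canRead(oldRec1, Map.of("age", false, "address", false));
        // With a default on the added field, old data stays readable.
        assert canRead(oldRec1, Map.of("age", false, "address", true));
    }
}
```

The patch extends the validator to apply this kind of per-branch check to the records inside a union instead of stopping at the union level.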
[jira] [Commented] (AVRO-1908) IPC TestSpecificCompiler build is broken
[ https://issues.apache.org/jira/browse/AVRO-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15463575#comment-15463575 ] Ryan Blue commented on AVRO-1908: - [~busbey], could you review? Thanks! > IPC TestSpecificCompiler build is broken > > > Key: AVRO-1908 > URL: https://issues.apache.org/jira/browse/AVRO-1908 > Project: Avro > Issue Type: Bug > Components: java >Reporter: Ryan Blue > > AVRO-1884 changed {{SpecificCompiler.makePath}} to {{private > SpecificCompiler#makePath}}, which broke building the tests in the IPC > package. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AVRO-1908) IPC TestSpecificCompiler build is broken
Ryan Blue created AVRO-1908: --- Summary: IPC TestSpecificCompiler build is broken Key: AVRO-1908 URL: https://issues.apache.org/jira/browse/AVRO-1908 Project: Avro Issue Type: Bug Components: java Reporter: Ryan Blue AVRO-1884 changed {{SpecificCompiler.makePath}} to {{private SpecificCompiler#makePath}}, which broke building the tests in the IPC package. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1874) py3 avro module import upsets logging level in host application
[ https://issues.apache.org/jira/browse/AVRO-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1874: Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for fixing this, [~torgebo]! I've merged your PR into master and it should be in the 1.8.2 release. > py3 avro module import upsets logging level in host application > --- > > Key: AVRO-1874 > URL: https://issues.apache.org/jira/browse/AVRO-1874 > Project: Avro > Issue Type: Bug > Components: python >Affects Versions: 1.8.1 > Environment: Mac OSX El Capitan, Macbook Pro, > Anaconda Python v. 3.5.1 > Avro installed from source of Avro 1.8.1/lang/py3 > (apache package "avro-src-1.8.1.tar.gz") > using "sudo python setup.py install" >Reporter: Torgeir Børresen >Assignee: Torgeir Børresen >Priority: Critical > Fix For: 1.8.2 > > > When importing "avro.datafile" the logging level of the host application gets > overridden. > In the simple example provided here: > https://github.com/torgebo/avro-1.8.1-logging-break > the logging level is wrongly set to "logging.WARNING" during execution > instead of "logging.INFO". > The issue seems to be resolved by using module-level loggers in the pattern of > logger = logging.getLogger(__name__) > and replacing current calls made through the root "logging" module with calls to this > "logger" instead. This approach is described here: > https://docs.python.org/3/howto/logging.html#logging-advanced-tutorial > When setting this logger across all avro source files, it is observed that the > application sets the logging level faithfully. > > This issue was not observed with python version 2, although the recommended > way to handle module-level logging as described in the python logging docs > seems to be the same (i.e. using the logging.getLogger method to access the > logger handle). > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1884) Add set source file suffix function for generate non-java file #90
[ https://issues.apache.org/jira/browse/AVRO-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1884: Resolution: Fixed Status: Resolved (was: Patch Available) Merged #90. Thanks for contributing, [~shijinkui]! > Add set source file suffix function for generate non-java file #90 > -- > > Key: AVRO-1884 > URL: https://issues.apache.org/jira/browse/AVRO-1884 > Project: Avro > Issue Type: Wish > Components: java >Reporter: shijinkui >Assignee: shijinkui > Fix For: 1.9.0, 1.8.2 > > > Support generating non-Java source files, for example: > compiler.setSuffix(".scala") > compiler.setTemplateDir(templatePath) > compiler.compileToDestination(file, new File("src/main/scala/")) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1879) Make conversions field transient in compiled SpecificRecord
[ https://issues.apache.org/jira/browse/AVRO-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1879: Resolution: Fixed Status: Resolved (was: Patch Available) Merged #108. Thanks [~mwong]! > Make conversions field transient in compiled SpecificRecord > --- > > Key: AVRO-1879 > URL: https://issues.apache.org/jira/browse/AVRO-1879 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.1 >Reporter: Michael Wong >Assignee: Michael Wong > Fix For: 1.9.0, 1.8.2 > > Attachments: conversions-transient.patch > > > Add a transient modifier to the conversions field as such > {code} > private final transient org.apache.avro.Conversion[] conversions > {code} > This will allow Serializers and type inference systems like Flink's > TypeExtractor to not consider the {{conversions}} field and not try to > serialize it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
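For background on what the {{transient}} modifier buys here: Java serialization (and serializers and type extractors that honor it, like Flink's) skips transient fields, so the runtime-only conversions cache never travels with the record. A standalone illustration (the class and field names below are made up for the example, not the generated SpecificRecord code):

```java
import java.io.*;

// Demonstrates that java.io serialization skips transient fields, which is
// why marking the conversions cache transient keeps it out of serialized
// record instances.
public class TransientDemo {
    static class Event implements Serializable {
        String name = "myfield";
        transient Object[] conversions = new Object[] {"runtime-only cache"};
    }

    static Event roundTrip(Event e) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(e);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Event) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] args) {
        Event copy = roundTrip(new Event());
        assert copy.name.equals("myfield"); // regular field survives the round trip
        assert copy.conversions == null;    // transient field comes back unset
    }
}
```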
[jira] [Updated] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1704: Resolution: Fixed Status: Resolved (was: Patch Available) I committed the last patch with the spec changes, which closes out this issue. Thanks [~nielsbasjes], [~cutting], [~busbey], and [~dasch] for making this happen! > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement > Components: java, spec >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Fix For: 1.9.0, 1.8.3 > > Attachments: AVRO-1704-2016-05-03-Unfinished.patch, > AVRO-1704-20160410.patch, AVRO-1704.3.patch, AVRO-1704.4.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. 
> The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
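The five pieces listed in the description map directly onto what became Avro's single-object encoding: a two-byte version marker {{C3 01}}, the 8-byte little-endian CRC-64-AVRO fingerprint of the writer's schema, then the encoded datum. A minimal sketch of just that framing, independent of the Avro library (the fingerprint below is a placeholder value, not a real schema fingerprint):

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch of the single-object message framing that came out of AVRO-1704:
// 2-byte version marker, 8-byte little-endian schema fingerprint, datum.
public class SingleObjectFraming {
    static final byte MAGIC_0 = (byte) 0xC3;
    static final byte MAGIC_1 = (byte) 0x01;

    static byte[] frame(long schemaFingerprint, byte[] encodedDatum) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(MAGIC_0);
        out.write(MAGIC_1);
        byte[] fp = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
                .putLong(schemaFingerprint).array();
        out.write(fp, 0, fp.length);                  // fingerprint, little-endian
        out.write(encodedDatum, 0, encodedDatum.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] msg = frame(0x0123456789ABCDEFL, new byte[] {42});
        assert msg.length == 11;                      // 2 marker + 8 fingerprint + 1 datum
        assert msg[0] == MAGIC_0 && msg[1] == MAGIC_1;
        assert msg[2] == (byte) 0xEF;                 // least-significant byte first
    }
}
```

A reader does the reverse: check the marker, look up the fingerprint in its SchemaStore to recover the writer's schema, then decode the remaining bytes.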
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15463478#comment-15463478 ] Ryan Blue commented on AVRO-1704: - Thanks for reviewing! > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement > Components: java, spec >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Fix For: 1.9.0, 1.8.3 > > Attachments: AVRO-1704-2016-05-03-Unfinished.patch, > AVRO-1704-20160410.patch, AVRO-1704.3.patch, AVRO-1704.4.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. 
A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1848) Can't use null or false defaults in Ruby
[ https://issues.apache.org/jira/browse/AVRO-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1848: Resolution: Fixed Status: Resolved (was: Patch Available) I committed the fix, along with a test. Thanks for contributing, [~theturtle32]! > Can't use null or false defaults in Ruby > > > Key: AVRO-1848 > URL: https://issues.apache.org/jira/browse/AVRO-1848 > Project: Avro > Issue Type: Bug > Components: ruby >Affects Versions: 1.8.0 > Environment: Any >Reporter: Brian McKelvey >Assignee: Brian McKelvey >Priority: Critical > Labels: easyfix > Fix For: 1.8.2 > > Original Estimate: 2h > Remaining Estimate: 2h > > When calling {{to_avro}} on an {{Avro::Schema::Field}} instance (part of > calling {{to_avro}} on an instance of {{Avro::Schema::RecordSchema}}), it > will not include the default value definition if the default value is falsey. > The offending code is: > {code:ruby} > def to_avro(names=Set.new) > {'name' => name, 'type' => type.to_avro(names)}.tap do |avro| > avro['default'] = default if default > avro['order'] = order if order > end > end > {code} > Using the {{if default}} conditional predicate here is inappropriate, as is > relying on {{nil}} values to represent no default, because {{null}} in JSON > maps to {{nil}} in Ruby. > This is a critical show-stopper to using AvroTurf with the Confluent Schema > Registry because it is quietly uploading incorrect schemas, causing > downstream readers to behave incorrectly and also causing the schema registry > to reject new schema versions as incompatible when they are actually just > fine if the falsey default values are included when submitting the schema to > the registry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15461949#comment-15461949 ] Ryan Blue commented on AVRO-1704: - I'm marking this as a blocker for the 1.8.2 release because the code is committed. If we release the implementation, I think we should also include the spec changes. > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement > Components: java, spec >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Fix For: 1.9.0, 1.8.3 > > Attachments: AVRO-1704-2016-05-03-Unfinished.patch, > AVRO-1704-20160410.patch, AVRO-1704.3.patch, AVRO-1704.4.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. 
> The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15461947#comment-15461947 ] Ryan Blue commented on AVRO-1704: - [~busbey], could you have a look at the last patch I posted with the spec changes? I'd like to get it into 1.8.2 since the code is. Thank you! > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement > Components: java, spec >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Fix For: 1.9.0, 1.8.3 > > Attachments: AVRO-1704-2016-05-03-Unfinished.patch, > AVRO-1704-20160410.patch, AVRO-1704.3.patch, AVRO-1704.4.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. 
> The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1888) Java: Single-record encoding marker bytes check is incorrect
[ https://issues.apache.org/jira/browse/AVRO-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1888: Resolution: Fixed Status: Resolved (was: Patch Available) I committed this. Thanks for reviewing, [~busbey]! > Java: Single-record encoding marker bytes check is incorrect > > > Key: AVRO-1888 > URL: https://issues.apache.org/jira/browse/AVRO-1888 > Project: Avro > Issue Type: Bug > Components: java >Reporter: Ryan Blue >Assignee: Ryan Blue >Priority: Blocker > Fix For: 1.8.2 > > Attachments: AVRO-1888.1.patch > > > It looks like the check for correct marker bytes is incorrect. > The check should validate both marker/version bytes match what is expected, > but is instead this: > {code} > if (! (BinaryMessageEncoder.V1_HEADER[0] == header[0]) > && BinaryMessageEncoder.V1_HEADER[1] == header[1]) { . . . } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
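The bug quoted above is a misplaced negation: the {{!}} only covers the first byte comparison, so a message whose second marker byte is wrong (or whose bytes are both wrong) is not flagged. A self-contained illustration of the buggy condition next to the corrected one (simplified from the real decoder; the method names here are made up):

```java
// AVRO-1888: the negation only covers the first byte comparison, so a header
// with a bad second marker byte slips through. The fix rejects when either
// marker byte differs.
public class HeaderCheck {
    static final byte[] V1_HEADER = new byte[] {(byte) 0xC3, (byte) 0x01};

    // Buggy condition, as quoted in the issue.
    static boolean buggyRejects(byte[] header) {
        return !(V1_HEADER[0] == header[0]) && V1_HEADER[1] == header[1];
    }

    // Fixed condition: reject when either marker byte differs.
    static boolean fixedRejects(byte[] header) {
        return V1_HEADER[0] != header[0] || V1_HEADER[1] != header[1];
    }

    public static void main(String[] args) {
        byte[] badSecondByte = new byte[] {(byte) 0xC3, (byte) 0x02};
        assert !buggyRejects(badSecondByte); // bug: invalid header accepted
        assert fixedRejects(badSecondByte);  // fix: invalid header rejected
        assert !fixedRejects(new byte[] {(byte) 0xC3, (byte) 0x01}); // valid passes
    }
}
```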
[jira] [Updated] (AVRO-607) SpecificData.getSchema not thread-safe
[ https://issues.apache.org/jira/browse/AVRO-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-607: --- Resolution: Fixed Status: Resolved (was: Patch Available) I committed the patch that uses Guava's cache. Thanks for this contribution, [~Andrius Driu]! > SpecificData.getSchema not thread-safe > -- > > Key: AVRO-607 > URL: https://issues.apache.org/jira/browse/AVRO-607 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.3.3, 1.8.1 >Reporter: Stephen Tu >Assignee: Andrius Druzinis-Vitkus >Priority: Blocker > Labels: newbie, patch > Fix For: 1.8.2 > > Attachments: > 0001-AVRO-607-Changed-SpecificData.getSchema-to-use-a-thr.patch, > AVRO-607.patch, AVRO-607.patch > > > SpecificData.getSchema uses a WeakHashMap to cache schemas, but WeakHashMap > is not thread-safe, and the method itself is not synchronized. Seems like > this could lead to the data structure getting corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-607) SpecificData.getSchema not thread-safe
[ https://issues.apache.org/jira/browse/AVRO-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15461934#comment-15461934 ] Ryan Blue commented on AVRO-607: Sounds good to me. Thanks for replying! > SpecificData.getSchema not thread-safe > -- > > Key: AVRO-607 > URL: https://issues.apache.org/jira/browse/AVRO-607 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.3.3, 1.8.1 >Reporter: Stephen Tu >Assignee: Andrius Druzinis-Vitkus >Priority: Blocker > Labels: newbie, patch > Fix For: 1.8.2 > > Attachments: > 0001-AVRO-607-Changed-SpecificData.getSchema-to-use-a-thr.patch, > AVRO-607.patch, AVRO-607.patch > > > SpecificData.getSchema uses a WeakHashMap to cache schemas, but WeakHashMap > is not thread-safe, and the method itself is not synchronized. Seems like > this could lead to the data structure getting corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
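The thread-safety fix amounts to replacing the unsynchronized {{WeakHashMap}} with a concurrent structure. The committed patch uses Guava's cache; a plain {{ConcurrentHashMap}} sketch of the same memoization pattern (with made-up names and a string stand-in for the computed schema) looks like this:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a thread-safe schema cache in the spirit of the AVRO-607 fix.
// The real patch uses Guava's cache (which also offers weak keys); this
// shows the minimal thread-safe memoization pattern.
public class SchemaCache {
    private final Map<Class<?>, String> schemaCache = new ConcurrentHashMap<>();

    // Stand-in for the expensive derivation done by SpecificData.getSchema.
    private String computeSchema(Class<?> type) {
        return "{\"type\": \"record\", \"name\": \"" + type.getSimpleName() + "\"}";
    }

    public String getSchema(Class<?> type) {
        // computeIfAbsent is atomic: concurrent callers cannot corrupt the map,
        // unlike unsynchronized puts into a WeakHashMap.
        return schemaCache.computeIfAbsent(type, this::computeSchema);
    }

    public static void main(String[] args) {
        SchemaCache cache = new SchemaCache();
        String first = cache.getSchema(String.class);
        assert first.contains("String");
        assert cache.getSchema(String.class) == first; // cached instance reused
    }
}
```

Note that unlike {{WeakHashMap}}, a plain {{ConcurrentHashMap}} holds strong references to the class keys, which is one reason the committed patch uses Guava's cache with weak keys instead.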
[jira] [Resolved] (AVRO-1719) Avro fails to build against Boost 1.59.0
[ https://issues.apache.org/jira/browse/AVRO-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved AVRO-1719. - Resolution: Fixed > Avro fails to build against Boost 1.59.0 > > > Key: AVRO-1719 > URL: https://issues.apache.org/jira/browse/AVRO-1719 > Project: Avro > Issue Type: Bug > Components: build >Affects Versions: 1.7.7 >Reporter: Tim Smith >Assignee: Romain Geissler >Priority: Blocker > Fix For: 1.7.8, 1.8.2 > > > Avro fails to build on OS X with Boost 1.59.0, dying on errors about > undeclared BOOST_ identifiers. > Build logs are here: > https://gist.github.com/anonymous/03736608223d42f45ab1#file-02-make-L180 > Homebrew is tracking packages which fail to build against Boost 1.59.0 at > https://github.com/Homebrew/homebrew/pull/42960. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1719) Avro fails to build against Boost 1.59.0
[ https://issues.apache.org/jira/browse/AVRO-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated AVRO-1719: Assignee: Romain Geissler > Avro fails to build against Boost 1.59.0 > > > Key: AVRO-1719 > URL: https://issues.apache.org/jira/browse/AVRO-1719 > Project: Avro > Issue Type: Bug > Components: build >Affects Versions: 1.7.7 >Reporter: Tim Smith >Assignee: Romain Geissler >Priority: Blocker > Fix For: 1.7.8, 1.8.2 > > > Avro fails to build on OS X with Boost 1.59.0, dying on errors about > undeclared BOOST_ identifiers. > Build logs are here: > https://gist.github.com/anonymous/03736608223d42f45ab1#file-02-make-L180 > Homebrew is tracking packages which fail to build against Boost 1.59.0 at > https://github.com/Homebrew/homebrew/pull/42960. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1719) Avro fails to build against Boost 1.59.0
[ https://issues.apache.org/jira/browse/AVRO-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15461915#comment-15461915 ] Ryan Blue commented on AVRO-1719: - I committed this. It should be safe because it only affects test code. > Avro fails to build against Boost 1.59.0 > > > Key: AVRO-1719 > URL: https://issues.apache.org/jira/browse/AVRO-1719 > Project: Avro > Issue Type: Bug > Components: build >Affects Versions: 1.7.7 >Reporter: Tim Smith >Priority: Blocker > Fix For: 1.7.8, 1.8.2 > > > Avro fails to build on OS X with Boost 1.59.0, dying on errors about > undeclared BOOST_ identifiers. > Build logs are here: > https://gist.github.com/anonymous/03736608223d42f45ab1#file-02-make-L180 > Homebrew is tracking packages which fail to build against Boost 1.59.0 at > https://github.com/Homebrew/homebrew/pull/42960. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1610) HttpTransceiver.java allocates arbitrary amount of memory
[ https://issues.apache.org/jira/browse/AVRO-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456575#comment-15456575 ] Ryan Blue commented on AVRO-1610: - You can paste it in a comment, I can grab the raw comment source. You may also be able to surround it in a verbatim area, the docs for it are here: https://jira.atlassian.com/secure/WikiRendererHelpAction.jspa?section=advanced > HttpTransceiver.java allocates arbitrary amount of memory > - > > Key: AVRO-1610 > URL: https://issues.apache.org/jira/browse/AVRO-1610 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.7.7 >Reporter: Philip Zeyliger > > In {{HttpTransceiver.java}}, Avro does: > {code} > int length = (in.read()<<24)+(in.read()<<16)+(in.read()<<8)+in.read(); > if (length == 0) { // end of buffers > return buffers; > } > ByteBuffer buffer = ByteBuffer.allocate(length); > {code} > This means that badly formatted input (like that produced by {{curl > http://host/ --data foo}} and many common security scanners) will trigger an > OutOfMemory exception. This is undesirable, especially combined with setups > that kill the process on out of memory exceptions. > This bug is similar in spirit to AVRO-. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1610) HttpTransceiver.java allocates arbitrary amount of memory
[ https://issues.apache.org/jira/browse/AVRO-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456456#comment-15456456 ] Ryan Blue commented on AVRO-1610: - The allocation is limited to Integer.MAX_VALUE, 2GB. That's big, but not necessarily going to crash services. If this becomes a problem then we can add a configurable limit, but otherwise I'm fine with the allocation proceeding. > HttpTransceiver.java allocates arbitrary amount of memory > - > > Key: AVRO-1610 > URL: https://issues.apache.org/jira/browse/AVRO-1610 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.7.7 >Reporter: Philip Zeyliger > > In {{HttpTransceiver.java}}, Avro does: > {code} > int length = (in.read()<<24)+(in.read()<<16)+(in.read()<<8)+in.read(); > if (length == 0) { // end of buffers > return buffers; > } > ByteBuffer buffer = ByteBuffer.allocate(length); > {code} > This means that badly formatted input (like that produced by {{curl > http://host/ --data foo}} and many common security scanners) will trigger an > OutOfMemory exception. This is undesirable, especially combined with setups > that kill the process on out of memory exceptions. > This bug is similar in spirit to AVRO-. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1610) HttpTransceiver.java allocates arbitrary amount of memory
[ https://issues.apache.org/jira/browse/AVRO-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456373#comment-15456373 ] Ryan Blue commented on AVRO-1610: - Sounds like a good assessment of the problem to me. Would you like to build a patch to fix it? I'll review. > HttpTransceiver.java allocates arbitrary amount of memory > - > > Key: AVRO-1610 > URL: https://issues.apache.org/jira/browse/AVRO-1610 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.7.7 >Reporter: Philip Zeyliger > > In {{HttpTransceiver.java}}, Avro does: > {code} > int length = (in.read()<<24)+(in.read()<<16)+(in.read()<<8)+in.read(); > if (length == 0) { // end of buffers > return buffers; > } > ByteBuffer buffer = ByteBuffer.allocate(length); > {code} > This means that badly formatted input (like that produced by {{curl > http://host/ --data foo}} and many common security scanners) will trigger an > OutOfMemory exception. This is undesirable, especially combined with setups > that kill the process on out of memory exceptions. > This bug is similar in spirit to AVRO-. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
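The defensive fix discussed above is to sanity-check the announced length before allocating. A standalone sketch of that idea (the cap value and method names are illustrative, not Avro's actual API; byte masking is added so stray sign bits cannot yield a bogus negative length):

```java
// Sketch of a defensive length check for the AVRO-1610 discussion: decode the
// 4-byte big-endian length prefix, then cap it before allocating a buffer.
public class BoundedLengthRead {
    static final int MAX_BUFFER_LENGTH = 1 << 20; // 1 MiB sanity cap (example value)

    // Same big-endian decoding as HttpTransceiver, with each byte masked.
    static int decodeLength(byte[] prefix) {
        return ((prefix[0] & 0xFF) << 24) | ((prefix[1] & 0xFF) << 16)
             | ((prefix[2] & 0xFF) << 8) | (prefix[3] & 0xFF);
    }

    static int checkedLength(byte[] prefix, int maxLength) {
        int length = decodeLength(prefix);
        if (length < 0 || length > maxLength) {
            throw new IllegalArgumentException("Unreasonable frame length: " + length);
        }
        return length;
    }

    public static void main(String[] args) {
        assert checkedLength(new byte[] {0, 0, 0, 5}, MAX_BUFFER_LENGTH) == 5;
        // A garbage prefix like "foo\n" decodes to ~1.7 GB and is rejected
        // instead of triggering a huge allocation.
        boolean rejected = false;
        try {
            checkedLength(new byte[] {'f', 'o', 'o', '\n'}, MAX_BUFFER_LENGTH);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        assert rejected;
    }
}
```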
[jira] [Commented] (AVRO-1895) DeepCopy does not work with logical types
[ https://issues.apache.org/jira/browse/AVRO-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15453569#comment-15453569 ] Ryan Blue commented on AVRO-1895: - +1. Looks good to me. Thanks, [~cutting]! > DeepCopy does not work with logical types > - > > Key: AVRO-1895 > URL: https://issues.apache.org/jira/browse/AVRO-1895 > Project: Avro > Issue Type: Improvement > Components: logical types >Affects Versions: 1.8.1 >Reporter: Taras Bobrovytsky >Assignee: Doug Cutting >Priority: Critical > Attachments: AVRO-1895.patch, AVRO-1895.patch, AVRO-1895.patch, > AVRO-1895.patch > > > AvroSchema is taken from a compiled avsc file which contains a decimal field. > {code} > AvroSchema.Builder builder = AvroSchema.newBuilder(); > BigDecimal bd = new BigDecimal(new BigInteger("155"), 3); > campaignBuilder.setDecimalField(bd); > AvroSchema source = builder.build(); > //This line causes an exception > AvroSchema.Builder builder1 = AvroSchema.newBuilder(source); > {code} > Exception: > {code} > InvocationTargetException: java.math.BigDecimal cannot be cast to > java.nio.ByteBuffer > {code} > The same failure happens with GenericData as well: > {code} > GenericRecord copy = GenericData.get().deepCopy(AvroSchema.getClassSchema(), > source); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1895) DeepCopy does not work with logical types
[ https://issues.apache.org/jira/browse/AVRO-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15453495#comment-15453495 ] Ryan Blue commented on AVRO-1895: - I think this should use the {{getConversionByClass}} method for generic. I'd be fine with assuming the right class is used for specific, but generic allows users to set objects and I don't think we should require that the class is always the "right" one if there is a way to choose the right conversion for the object that gets passed in. > DeepCopy does not work with logical types > - > > Key: AVRO-1895 > URL: https://issues.apache.org/jira/browse/AVRO-1895 > Project: Avro > Issue Type: Improvement > Components: logical types >Affects Versions: 1.8.1 >Reporter: Taras Bobrovytsky >Assignee: Doug Cutting >Priority: Critical > Attachments: AVRO-1895.patch, AVRO-1895.patch, AVRO-1895.patch > > > AvroSchema is taken from a compiled avsc file which contains a decimal field. > {code} > AvroSchema.Builder builder = AvroSchema.newBuilder(); > BigDecimal bd = new BigDecimal(new BigInteger("155"), 3); > campaignBuilder.setDecimalField(bd); > AvroSchema source = builder.build(); > //This line causes an exception > AvroSchema.Builder builder1 = AvroSchema.newBuilder(source); > {code} > Exception: > {code} > InvocationTargetException: java.math.BigDecimal cannot be cast to > java.nio.ByteBuffer > {code} > The same failure happens with GenericData as well: > {code} > GenericRecord copy = GenericData.get().deepCopy(AvroSchema.getClassSchema(), > source); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1895) DeepCopy does not work with logical types
[ https://issues.apache.org/jira/browse/AVRO-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412011#comment-15412011 ] Ryan Blue commented on AVRO-1895: - bq. The workaround in AVRO-1891 may not help here We should find out. > DeepCopy does not work with logical types > - > > Key: AVRO-1895 > URL: https://issues.apache.org/jira/browse/AVRO-1895 > Project: Avro > Issue Type: Improvement > Components: logical types >Affects Versions: 1.8.1 >Reporter: Taras Bobrovytsky > > AvroSchema is taken from a compiled avsc file which contains a decimal field. > {code} > AvroSchema.Builder builder = AvroSchema.newBuilder(); > BigDecimal bd = new BigDecimal(new BigInteger("155"), 3); > campaignBuilder.setDecimalField(bd); > AvroSchema source = builder.build(); > //This line causes an exception > AvroSchema.Builder builder1 = AvroSchema.newBuilder(source); > {code} > Exception: > {code} > InvocationTargetException: java.math.BigDecimal cannot be cast to > java.nio.ByteBuffer > {code} > The same failure happens with GenericData as well: > {code} > GenericRecord copy = GenericData.get().deepCopy(AvroSchema.getClassSchema(), > source); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1894) GenericData does not add Logical Type conversions by default
[ https://issues.apache.org/jira/browse/AVRO-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408395#comment-15408395 ] Ryan Blue commented on AVRO-1894: - I don't think that, in general, adding conversions by default is a good idea. There's currently no method to remove a conversion, and no guarantee that people wouldn't use different ones. For example, the date/time conversions use Joda types but you might want to use different classes for Java 8. There's also some cost to conversion lookups. Is there something we could do to make this more obvious instead of adding default conversions? What about better errors when the wrong types are used? > GenericData does not add Logical Type conversions by default > > > Key: AVRO-1894 > URL: https://issues.apache.org/jira/browse/AVRO-1894 > Project: Avro > Issue Type: Improvement >Affects Versions: 1.8.1 >Reporter: Taras Bobrovytsky > > In order to support Decimal fields, the following line needs to be added: > {code} > new GenericData().addLogicalTypeConversion(new > Conversions.DecimalConversion()); > {code} > It would be more convenient if logical type conversions were added by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)