[jira] [Updated] (AVRO-2028) Specific Data, newBuilder(existingInstance) fails for BigDecimal

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2028:
--
Component/s: java

> Specific Data, newBuilder(existingInstance) fails for BigDecimal
> 
>
> Key: AVRO-2028
> URL: https://issues.apache.org/jira/browse/AVRO-2028
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Adrian McCague
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Using 1.8.2-rc3
> When attempting to use:
> {{MyType.newBuilder(existingType)}}
> With a field of type {{Union}}:
> {code}
> ["null",{"type":"bytes","logicalType":"decimal","precision":20,"scale":8}]
> {code}
> I get the following exception:
> {code}
> org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"bytes","logicalType":"decimal","precision":20,"scale":8}]: 12000.
>   at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:740)
>   at org.apache.avro.generic.GenericData.deepCopyRaw(GenericData.java:1146)
>   at org.apache.avro.generic.GenericData.deepCopy(GenericData.java:1062)
>   at MyType$Builder.
> {code}
> I suspect it may be related, but I also noticed the following in the 
> generated Specific Data class:
> {code}
> private static final org.apache.avro.Conversion[] conversions =
>   new org.apache.avro.Conversion[] {
>   null,
>   TIMESTAMP_CONVERSION,
>   null,
>   null, // Should be DECIMAL_CONVERSION
>   null,
>   null,
>   null,
>   null,
>   null
>   };
> {code}
> ie, the conversion was missing.
> Adding this by hand however did not resolve the issue. I will add this as 
> another issue.
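
For context, the Avro specification describes the decimal logical type on top of bytes as the two's-complement, big-endian encoding of the unscaled integer, with the scale taken from the schema. A raw BigDecimal is therefore not a bytes value until some conversion is applied, which may be related to the "Not in union" failure above. The following is a minimal JDK-only sketch of that round trip, assuming the scale of 8 from the union shown above. The class and method names are hypothetical and are not Avro's Conversion API; rescaling on encode is a simplification chosen for the sketch.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of Avro: mirrors the byte layout the Avro
// specification describes for the decimal logical type.
final class DecimalBytesSketch {
    // Encode: rescale to the schema's scale (assumption: widening only),
    // then take the unscaled integer as two's-complement big-endian bytes.
    static byte[] toBytes(BigDecimal value, int scale) {
        return value.setScale(scale).unscaledValue().toByteArray();
    }

    // Decode: rebuild the BigDecimal from the bytes and the schema's scale.
    static BigDecimal fromBytes(byte[] bytes, int scale) {
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public static void main(String[] args) {
        byte[] encoded = toBytes(new BigDecimal("12000"), 8);
        System.out.println(fromBytes(encoded, 8));
    }
}
```

The decoded value compares equal to the original even though its scale differs, which is why the sketch relies on compareTo rather than equals when checking the round trip.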



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AVRO-2290) TestSpecificLogicalTypes.testRecordWithJsr310LogicalTypes breaks on Java 11

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2290:
--
Component/s: java

> TestSpecificLogicalTypes.testRecordWithJsr310LogicalTypes breaks on Java 11
> ---
>
> Key: AVRO-2290
> URL: https://issues.apache.org/jira/browse/AVRO-2290
> Project: Apache Avro
>  Issue Type: Sub-task
>  Components: java
>Reporter: Ismaël Mejía
>Priority: Minor
>
> Java 11 has nanosecond precision, so the test breaks. The test should 
> detect this and adapt accordingly.
> {code}
> [ERROR] testRecordWithJsr310LogicalTypes(org.apache.avro.specific.TestSpecificLogicalTypes)  Time elapsed: 0.24 s  <<< FAILURE!
> java.lang.AssertionError: Should match written record expected:<{"b": true, "i32": 34, "i64": 35, "f32": 3.14, "f64": 3019.34, "s": null, "d": 2018-12-20, "t": 15:02:53.535103, "ts": 2018-12-20T14:02:53.535127Z, "dec": 123.45}> but was:<{"b": true, "i32": 34, "i64": 35, "f32": 3.14, "f64": 3019.34, "s": null, "d": 2018-12-20, "t": 15:02:53.535, "ts": 2018-12-20T14:02:53.535Z, "dec": 123.45}>
> at org.apache.avro.specific.TestSpecificLogicalTypes.testRecordWithJsr310LogicalTypes(TestSpecificLogicalTypes.java:132)
> [ERROR] testAbilityToReadJodaRecordWrittenAsJsr310Record(org.apache.avro.specific.TestSpecificLogicalTypes)  Time elapsed: 0.005 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expected: is "15:02:53.639158"
>  but: was "15:02:53.639"
> at org.apache.avro.specific.TestSpecificLogicalTypes.testAbilityToReadJodaRecordWrittenAsJsr310Record(TestSpecificLogicalTypes.java:204)
> {code}
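
In the failure output above, the expected values carry sub-millisecond digits while the values read back stop at milliseconds. One way a test could adapt (an assumption for illustration, not necessarily the fix adopted for this issue) is to truncate the written value to milliseconds before comparing. A minimal sketch using only java.time, with a hypothetical class name:

```java
import java.time.LocalTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration; not taken from the Avro test suite.
final class MillisTruncationSketch {
    // Drop any sub-millisecond digits so the value matches what a
    // millisecond-precision field can round-trip.
    static LocalTime toMillis(LocalTime t) {
        return t.truncatedTo(ChronoUnit.MILLIS);
    }

    public static void main(String[] args) {
        LocalTime written = LocalTime.parse("15:02:53.535103");
        System.out.println(toMillis(written));
    }
}
```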





[jira] [Updated] (AVRO-1890) Java objects compiled from AVDL disregard default values

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1890:
--
Component/s: java

> Java objects compiled from AVDL disregard default values
> 
>
> Key: AVRO-1890
> URL: https://issues.apache.org/jira/browse/AVRO-1890
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Ryon Day
>Priority: Minor
>
> Given the schema
> {code}
> @namespace("fasdf")
> protocol foo {
>     enum Domain {
>         COM,
>         ORG,
>         BIZ,
>         MIL
>     }
>     record Wat {
>         Domain domain = "COM";
>         string foo = "asfdasdf";
>     }
> }
> {code}
> The resulting {{Wat}} Java class does not use the defined default values:
> {code}
> @Test
> public void asdfasdf() throws Exception {
>     fasdf.Wat x = new Wat();
>     System.out.print(x);
> }
> {code}
> This test results in:
> {code}
> {"domain": null, "foo": null}
> {code}





[jira] [Updated] (AVRO-2128) Schema parsing in the Java library is more permissive than the C implementation or the JSON specification

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2128:
--
Component/s: java

> Schema parsing in the Java library is more permissive than the C 
> implementation or the JSON specification
> -
>
> Key: AVRO-2128
> URL: https://issues.apache.org/jira/browse/AVRO-2128
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Zoltan Ivanfi
>Priority: Major
>
> When parsing schemas, the Java library accepts C-style comments (which are 
> forbidden in JSON) and is unaffected by trailing garbage (parsing stops as 
> soon as it reaches the end of the JSON structure).
> In the C library, however, comments and trailing whitespace cause an error.
> If a schema is accepted by one language binding, it should be accepted by the 
> other as well. The schema should also be valid JSON. It's the Java library 
> that does not enforce this by being more permissive than it should be, so it 
> seems that the Java implementation should be changed. However, we must also 
> consider whether making the Java library stricter at this point would make 
> any existing data unreadable.
> Fortunately, the schema that is written in the data files themselves is 
> always valid JSON, even if it is based on a non-JSON-conformant schema. The 
> reason for this is that the Java library parses the schema, builds an in-memory 
> representation and then reserializes that, thereby removing comments and 
> trailing garbage. So existing data files are not affected, only user-supplied 
> schemas. These can be manually updated (unlike existing data files).
> The real-world use-case where this discrepancy causes problems is Hive-Impala 
> interaction. Users can create tables in Hive by supplying an Avro schema. 
> That schema will be associated with the whole table by getting saved in the 
> Hive metastore. Impala also consults this metadata when accessing the table 
> and that causes an error in the Avro C library that Impala uses. This is 
> detailed in IMPALA-1024. In particular, [this 
> comment|https://issues.apache.org/jira/browse/IMPALA-1024?focusedCommentId=16261702=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16261702]
>  contains a lot of relevant information.
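
To make the "trailing garbage" case concrete, here is a rough JDK-only sketch of a strictness check: it finds the end of the first complete top-level JSON object or array and reports whether anything other than whitespace follows. It is an illustration under stated assumptions, not the parser used by either library: the class and method names are hypothetical, the input is assumed to start with an object or array (a bare string schema such as "string" is not handled), and comments are not detected at all.

```java
// Hypothetical helper for illustration; not part of Avro or Jackson.
final class TrailingContentSketch {
    // Returns the index just past the first complete top-level JSON
    // object or array in text, or -1 if none is complete.
    static int endOfFirstValue(String text) {
        int depth = 0;
        boolean inString = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                    // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                if (--depth == 0) {
                    return i + 1;
                }
            }
        }
        return -1;
    }

    // True when only whitespace follows the first complete value.
    static boolean isStrict(String text) {
        int end = endOfFirstValue(text);
        return end >= 0 && text.substring(end).trim().isEmpty();
    }
}
```

A permissive parser corresponds to stopping at endOfFirstValue and ignoring the rest; a strict one corresponds to additionally requiring isStrict.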





[jira] [Updated] (AVRO-2284) Incorrect EnumSymbol initialization in TestReadingWritingDataInEvolvedSchemas.java

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2284:
--
Component/s: java

> Incorrect EnumSymbol initialization in 
> TestReadingWritingDataInEvolvedSchemas.java
> --
>
> Key: AVRO-2284
> URL: https://issues.apache.org/jira/browse/AVRO-2284
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Zoltan Farkas
>Priority: Minor
>
> EnumSymbol is initialized with Record schema instead of Enum schema at:
> https://github.com/apache/avro/blob/master/lang/java/avro/src/test/java/org/apache/avro/TestReadingWritingDataInEvolvedSchemas.java#L310





[jira] [Updated] (AVRO-2068) Improve EnumSchema constructor performance

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2068:
--
Component/s: java

> Improve EnumSchema constructor performance
> --
>
> Key: AVRO-2068
> URL: https://issues.apache.org/jira/browse/AVRO-2068
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Zoltan Farkas
>Priority: Trivial
>
> at 
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/Schema.java#L745
>  :
> {code}
> 
>   private static class EnumSchema extends NamedSchema {
>     private final List<String> symbols;
>     private final Map<String,Integer> ordinals;
>     public EnumSchema(Name name, String doc,
>         LockableArrayList<String> symbols) {
>       super(Type.ENUM, name, doc);
>       this.symbols = symbols.lock();
>       this.ordinals = new HashMap<String,Integer>();
>       int i = 0;
>       for (String symbol : symbols)
>         if (ordinals.put(validateName(symbol), i++) != null)
>           throw new SchemaParseException("Duplicate enum symbol: "+symbol);
>     }
> 
> {code}
> should be changed to:
> {code}
> 
>   private static class EnumSchema extends NamedSchema {
>     private final List<String> symbols;
>     private final Map<String,Integer> ordinals;
>     public EnumSchema(Name name, String doc,
>         LockableArrayList<String> symbols) {
>       super(Type.ENUM, name, doc);
>       this.symbols = symbols.lock();
>       this.ordinals = new HashMap<String,Integer>(symbols.size());
>       int i = 0;
>       for (String symbol : symbols)
>         if (ordinals.put(validateName(symbol), i++) != null)
>           throw new SchemaParseException("Duplicate enum symbol: "+symbol);
>     }
> 
> {code}
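
A side note on presizing: java.util.HashMap resizes once the number of entries exceeds capacity times load factor (0.75 by default), so passing symbols.size() as the initial capacity can still trigger a resize for some sizes. The sketch below shows one common way to pick a capacity that avoids this, together with the same duplicate check via the return value of put(). It is a standalone illustration with hypothetical names, not the Avro constructor itself, and it omits validateName.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not Avro's EnumSchema.
final class EnumOrdinalsSketch {
    // An initial capacity large enough to hold n entries under the default
    // 0.75 load factor without resizing while the map is filled.
    static int capacityFor(int n) {
        return (int) (n / 0.75f) + 1;
    }

    // Map each symbol to its position; reject duplicates by checking the
    // previous value returned from put(), as the quoted constructor does.
    static Map<String, Integer> ordinals(List<String> symbols) {
        Map<String, Integer> result = new HashMap<>(capacityFor(symbols.size()));
        int i = 0;
        for (String symbol : symbols) {
            if (result.put(symbol, i++) != null) {
                throw new IllegalArgumentException("Duplicate enum symbol: " + symbol);
            }
        }
        return result;
    }
}
```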





[jira] [Updated] (AVRO-1852) Make org.apache.avro.Schema serializable (java.io.Serializable)

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1852:
--
Component/s: java

> Make org.apache.avro.Schema serializable (java.io.Serializable)
> ---
>
> Key: AVRO-1852
> URL: https://issues.apache.org/jira/browse/AVRO-1852
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Zoltan Farkas
>Priority: Minor
>
> here is a commit describing the implementation: 
> https://github.com/zolyfarkas/avro/commit/867f4d6a0f2e65a4ca8084f02b0d704a3acdb9d0





[jira] [Updated] (AVRO-2031) GenericData.writeEscapedString should be static

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2031:
--
Component/s: java

> GenericData.writeEscapedString should be static
> ---
>
> Key: AVRO-2031
> URL: https://issues.apache.org/jira/browse/AVRO-2031
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.1
>Reporter: Zoltan Farkas
>Priority: Trivial
>






[jira] [Updated] (AVRO-2070) Tolerate any Number when writing primitive values in Java in GenericDatumWriter

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2070:
--
Component/s: java

> Tolerate any Number when writing primitive values in Java in 
> GenericDatumWriter
> ---
>
> Key: AVRO-2070
> URL: https://issues.apache.org/jira/browse/AVRO-2070
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Daniil Gitelson
>Priority: Major
>
> Tolerating any Number (instead of the concrete Long, Double, Float) makes it 
> possible to use mutable Number implementations for performance reasons 
> (especially when iterating over primitive collections).
> Currently, this works only for int:
> {code:java}
>   // Here it works
>   case INT: out.writeInt(((Number)datum).intValue()); break;
>   // This should be replaced with ((Number)datum).longValue() etc
>   case LONG:out.writeLong((Long)datum);   break;
>   case FLOAT:   out.writeFloat((Float)datum); break;
>   case DOUBLE:  out.writeDouble((Double)datum);   break;
> {code}
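
The difference between the two styles of cast can be seen without Avro. The sketch below contrasts a strict cast to Long with a cast to Number followed by longValue(); the class and method names are hypothetical, and AtomicLong merely stands in for any mutable Number implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration; not GenericDatumWriter itself.
final class NumberToleranceSketch {
    // Strict variant, like the LONG case quoted above: only java.lang.Long passes.
    static long strictLong(Object datum) {
        return (Long) datum;
    }

    // Tolerant variant, as proposed: any Number implementation is accepted.
    static long tolerantLong(Object datum) {
        return ((Number) datum).longValue();
    }

    // Reports whether the strict variant rejects the given datum.
    static boolean strictRejects(Object datum) {
        try {
            strictLong(datum);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object mutable = new AtomicLong(42);
        System.out.println(tolerantLong(mutable));
        System.out.println(strictRejects(mutable));
    }
}
```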





[jira] [Updated] (AVRO-2278) GenericData.Record field getter not correct

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2278:
--
Component/s: java

> GenericData.Record field getter not correct
> ---
>
> Key: AVRO-2278
> URL: https://issues.apache.org/jira/browse/AVRO-2278
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Zoltan Farkas
>Priority: Major
>
> Currently the get field implementation is not correct in GenericData.Record:
> at: 
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java#L209
> {code}
> @Override public Object get(String key) {
>   Field field = schema.getField(key);
>   if (field == null) return null;
>   return values[field.pos()];
> }
> {code}
> The method returns null when a field is not present, making it impossible to 
> distinguish between:
> field value = null
> and
> field does not exist.
> A more "correct" implementation would be:
> {code}
> @Override public Object get(String key) {
>   Field field = schema.getField(key);
>   if (field == null) {
>     throw new IllegalArgumentException("Invalid field " + key);
>   }
>   return values[field.pos()];
> }
> {code}
> This will make the behavior consistent with put, which throws an exception 
> when setting a nonexistent field.
> When I made this change in my fork, some bugs in unit tests showed up.
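
The ambiguity described above can be reproduced with a small stand-in record. In the sketch below, get and put share one lookup that throws for unknown field names, so a null result can only mean "field exists and is unset or null". The class is hypothetical and only mimics the shape of a name-to-position lookup; it is not GenericData.Record.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not GenericData.Record.
final class StrictRecordSketch {
    private final Map<String, Integer> positions = new HashMap<>();
    private final Object[] values;

    StrictRecordSketch(String... fieldNames) {
        for (int i = 0; i < fieldNames.length; i++) {
            positions.put(fieldNames[i], i);
        }
        values = new Object[fieldNames.length];
    }

    void put(String key, Object value) {
        values[position(key)] = value;
    }

    Object get(String key) {
        return values[position(key)];
    }

    // Shared lookup: unknown names fail loudly for both get and put.
    private int position(String key) {
        Integer pos = positions.get(key);
        if (pos == null) {
            throw new IllegalArgumentException("Invalid field " + key);
        }
        return pos;
    }

    // Reports whether get() throws for the given key.
    static boolean getThrowsFor(StrictRecordSketch record, String key) {
        try {
            record.get(key);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```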





[jira] [Updated] (AVRO-2057) JsonDecoder.skipChildren does not skip map/records correctly

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2057:
--
Component/s: java

> JsonDecoder.skipChildren does not skip map/records correctly
> 
>
> Key: AVRO-2057
> URL: https://issues.apache.org/jira/browse/AVRO-2057
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Zoltan Farkas
>Priority: Critical
>
> at 
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java#L585
> {code}
>   @Override
>   public JsonParser skipChildren() throws IOException {
>     JsonToken tkn = elements.get(pos).token;
>     int level = (tkn == JsonToken.START_ARRAY || tkn == JsonToken.END_ARRAY) ? 1 : 0;
>     while (level > 0) {
>       switch(elements.get(++pos).token) {
>       case START_ARRAY:
>       case START_OBJECT:
>         level++;
>         break;
>       case END_ARRAY:
>       case END_OBJECT:
>         level--;
>         break;
>       }
>     }
>     return this;
>   }
> {code}
> should be:
> {code}
>   @Override
>   public JsonParser skipChildren() throws IOException {
>     JsonToken tkn = elements.get(pos).token;
>     int level = (tkn == JsonToken.START_ARRAY || tkn == JsonToken.START_OBJECT) ? 1 : 0;
>     while (level > 0) {
>       switch(elements.get(++pos).token) {
>       case START_ARRAY:
>       case START_OBJECT:
>         level++;
>         break;
>       case END_ARRAY:
>       case END_OBJECT:
>         level--;
>         break;
>       }
>     }
>     return this;
>   }
> {code}
> This results in de-serialization failures when the reader schema does not 
> have fields that are present in the serialized object and the writer schema. 
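
The effect of the corrected condition can be checked on a plain token list. The sketch below applies the same level-counting loop to a hypothetical Token enum (a stand-in for Jackson's JsonToken, so it runs with the JDK alone) and returns the index of the token that closes the structure opened at the starting position.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not JsonDecoder or Jackson.
final class SkipChildrenSketch {
    enum Token { START_OBJECT, END_OBJECT, START_ARRAY, END_ARRAY, VALUE }

    // Returns the index of the token that closes the structure opened at
    // pos, or pos itself when the token at pos does not open a structure.
    static int skipChildren(List<Token> tokens, int pos) {
        Token t = tokens.get(pos);
        int level = (t == Token.START_ARRAY || t == Token.START_OBJECT) ? 1 : 0;
        while (level > 0) {
            switch (tokens.get(++pos)) {
                case START_ARRAY:
                case START_OBJECT:
                    level++;
                    break;
                case END_ARRAY:
                case END_OBJECT:
                    level--;
                    break;
                default:
                    break;
            }
        }
        return pos;
    }

    // Token stream for an object holding a value and an array, followed by
    // one more value outside the object.
    static List<Token> sample() {
        return Arrays.asList(
            Token.START_OBJECT, Token.VALUE,
            Token.START_ARRAY, Token.VALUE, Token.END_ARRAY,
            Token.END_OBJECT, Token.VALUE);
    }
}
```

With the original condition (START_ARRAY or END_ARRAY), a call positioned on START_OBJECT would start with level 0 and return immediately without skipping the object's contents.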





[jira] [Updated] (AVRO-1953) ArrayIndexOutOfBoundsException in org.apache.avro.io.parsing.Symbol$Alternative.getSymbol

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1953:
--
Component/s: java

> ArrayIndexOutOfBoundsException in 
> org.apache.avro.io.parsing.Symbol$Alternative.getSymbol
> -
>
> Key: AVRO-1953
> URL: https://issues.apache.org/jira/browse/AVRO-1953
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.4
>Reporter: Yong Zhang
>Priority: Major
>
> We are facing an issue where an Avro MapReduce job cannot process the Avro 
> file in the reducer. 
> Here is the schema of our data:
> {
> "namespace" : "our package name",
> "type" : "record",
> "name" : "Lists",
> "fields" : [
> {"name" : "account_id", "type" : "long"},
> {"name" : "list_id", "type" : "string"},
> {"name" : "sequence_id", "type" : ["int", "null"]} ,
> {"name" : "name", "type" : ["string", "null"]},
> {"name" : "state", "type" : ["string", "null"]},
> {"name" : "description", "type" : ["string", "null"]},
> {"name" : "dynamic_filtered_list", "type" : ["int", "null"]},
> {"name" : "filter_criteria", "type" : ["string", "null"]},
> {"name" : "created_at", "type" : ["long", "null"]},
> {"name" : "updated_at", "type" : ["long", "null"]},
> {"name" : "deleted_at", "type" : ["long", "null"]},
> {"name" : "favorite", "type" : ["int", "null"]},
> {"name" : "delta", "type" : ["boolean", "null"]},
> {
> "name" : "list_memberships", "type" : {
> "type" : "array", "items" : {
> "name" : "ListMembership", "type" : "record",
> "fields" : [
> {"name" : "channel_id", "type" : "string"},
> {"name" : "created_at", "type" : ["long", "null"]},
> {"name" : "created_source", "type" : ["string", 
> "null"]},
> {"name" : "deleted_at", "type" : ["long", "null"]},
> {"name" : "sequence_id", "type" : ["int", "null"]}
> ]
> }
> }
> }
> ]
> }
> Our MapReduce job computes the delta of the above dataset and uses our merge 
> logic to merge the latest changes into the dataset.
> The whole MR job runs daily and worked fine for 18 months. During this time, 
> the merge MapReduce job failed twice with the following error (in the 
> reducer stage, which means the Avro data was read successfully and sent to 
> the reducers, where we sort the data by key and timestamp so that the 
> delta can be merged on the reducer side):
> java.lang.ArrayIndexOutOfBoundsException
>   at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
>   at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
>   at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>   at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
>   at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
>   at org.apache.avro.hadoop.io.AvroDeserializer.deserialize(AvroDeserializer.java:108)
>   at org.apache.avro.hadoop.io.AvroDeserializer.deserialize(AvroDeserializer.java:48)
>   at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:142)
>   at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:117)
>   at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:297)
>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:165)
>   at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:652)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(AccessController.java:366)
>   at javax.security.auth.Subject.doAs(Subject.java:572)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> The MapReduce job eventually fails in the reducer stage. I don't think 
> our data is corrupted, as it is read fine in the map stage. Every time we 
> got this error, we had to fetch the whole huge dataset from the source, 
> rebuild the Avro files, and start the daily merge again, until after several 
> months, then face this issue 

[jira] [Updated] (AVRO-2138) org.apache.avro.mapreduce.AvroMultipleOutputs.write copies Configuration on every invocation of write

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2138:
--
Component/s: java

> org.apache.avro.mapreduce.AvroMultipleOutputs.write copies Configuration on 
> every invocation of write
> -
>
> Key: AVRO-2138
> URL: https://issues.apache.org/jira/browse/AVRO-2138
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Stan Rosenberg
>Priority: Major
>
> While profiling a Spark job using AvroMultipleOutputs, I noticed that a great 
> deal of time is wasted copying the (Hadoop) Configuration. Indeed, this 
> happens on _every_ invocation of {{write}}: 
> [https://github.com/apache/avro/blob/master/lang/java/mapred/src/main/java/org/apache/avro/mapreduce/AvroMultipleOutputs.java#L437]
> After patching, I am seeing a speed-up of 2x and above in running time of the 
> same job.
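
The report does not show the patch, so the following is only a generic illustration of the pattern it describes: copying a configuration object on every call versus copying it once and reusing the result. java.util.Properties stands in for Hadoop's Configuration, all names are hypothetical, and the sketch is not thread-safe.

```java
import java.util.Properties;

// Hypothetical illustration; not AvroMultipleOutputs and not the actual patch.
final class CachedConfigSketch {
    private final Properties base;
    private Properties cached;   // created lazily, then reused
    int copies = 0;              // counts how often a copy was made

    CachedConfigSketch(Properties base) {
        this.base = base;
    }

    // Copy-per-call: the pattern the report identifies as the hot spot.
    Properties copyEveryTime() {
        copies++;
        Properties p = new Properties();
        p.putAll(base);
        return p;
    }

    // Copy-once: build the copy on first use and hand back the same one.
    Properties copyOnce() {
        if (cached == null) {
            cached = copyEveryTime();
        }
        return cached;
    }
}
```

Reusing one copy is only safe if callers do not mutate it between writes, which is an assumption the sketch does not enforce.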





[jira] [Updated] (AVRO-2145) Can't generate Javadoc on master

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2145:
--
Component/s: java

> Can't generate Javadoc on master
> 
>
> Key: AVRO-2145
> URL: https://issues.apache.org/jira/browse/AVRO-2145
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.9.0
>Reporter: Nandor Kollar
>Priority: Major
>
> {{mvn javadoc:aggregate}} fails with a bunch of Javadoc warnings on master 
> when building with JDK8.





[jira] [Updated] (AVRO-1831) Take advantage of JSR 3.5 annotations in the generated java classes.

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1831:
--
Component/s: java

> Take advantage of JSR 3.5 annotations in the generated java classes.
> 
>
> Key: AVRO-1831
> URL: https://issues.apache.org/jira/browse/AVRO-1831
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Zoltan Farkas
>Priority: Minor
>
> It would be nice if the generated records took advantage of:
> @javax.annotation.Nullable
> @javax.annotation.Nonnull
> to annotate the fields that can be null...
> This would have a documenting role and, more importantly, allow FindBugs to 
> detect incorrect use of the fields.





[jira] [Updated] (AVRO-1698) cant serialize json with characters >127 when compiling with signed char

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1698:
--
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Fix for AVRO-1190 will automatically take care of this issue.

> cant serialize json with characters >127 when compiling with signed char
> 
>
> Key: AVRO-1698
> URL: https://issues.apache.org/jira/browse/AVRO-1698
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.7
> Environment: windows, linux w signed char
>Reporter: svante karlsson
>Priority: Major
> Attachments: AVRO-1698.1.patch
>
>
> iscntrl assumes 0-255 but signed char gets expanded to bad things
> pullreq #38  on github solves the issue.
> change line 196 lang/c++/impl/json/JsonIO.hh
> from 
>   if (! iscntrl(*p)) {
> to
>  if (! iscntrl((uint8_t) *p)) {





[jira] [Commented] (AVRO-1698) cant serialize json with characters >127 when compiling with signed char

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730784#comment-16730784
 ] 

Thiruvalluvan M. G. commented on AVRO-1698:
---

Fix for AVRO-1190 will automatically take care of this and hence will resolve 
this as a duplicate of AVRO-1190.






[jira] [Resolved] (AVRO-2037) Use std::any where available

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-2037.
---
Resolution: Fixed

Merged the pull request.

> Use std::any where available
> 
>
> Key: AVRO-2037
> URL: https://issues.apache.org/jira/browse/AVRO-2037
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Reporter: Darryl Green
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> The use of boost::any to hold union types causes a significant performance 
> hit, especially for small types, in particular when using 
> [null,primitive] for optional primitive type elements of a schema. Most 
> (all?) implementations of std::any include a small value optimisation that 
> avoids allocation overhead for scalars and other small types. It's a little 
> unfortunate that a C++ binding of a notionally high 
> performance serialization format performs so poorly in this case (note - I 
> had previously proposed using boost::variant which would address this problem 
> but would fail to support recursive types or truly huge numbers of distinct 
> types in a union). Obviously this requires C++17 but could fall back to 
> boost::any for older compilers.





[jira] [Created] (AVRO-2295) Move C++ to std from boost wherever possible

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)
Thiruvalluvan M. G. created AVRO-2295:
-

 Summary: Move C++ to std from boost wherever possible
 Key: AVRO-2295
 URL: https://issues.apache.org/jira/browse/AVRO-2295
 Project: Apache Avro
  Issue Type: Improvement
  Components: c++
Reporter: Thiruvalluvan M. G.


Now that we have mandated C++11 as a requirement for 1.9.0 onwards, the 
following boost features can be moved to {{std::}}:
 * array
 * scoped_ptr (in favor of unique_ptr)
 * shared_ptr
 * static_assert
 * type_traits
 * weak_ptr
 * noncopyable (in favor of {{= delete}} for copy constructors)
 * ptr_container (in favor of container of unique_ptr)

With that the only boost features still in use will be:
 * any
 * blank
 * format
 * iostreams
 * regex
 * program_options

Of these, any is part of {{C++}}17, and hence when {{C++}}17 is used we can 
use {{std::}} for that too.





[jira] [Comment Edited] (AVRO-1936) avrogencpp, includes should have more guards or generate more headers

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730381#comment-16730381
 ] 

Thiruvalluvan M. G. edited comment on AVRO-1936 at 12/28/18 5:09 PM:
-

Even though the mechanism of using individual guards for each generated type 
would work, it is not a common idiom in {{C++}}. In a single project there 
would be multiple definitions of the same type (in multiple header files) and 
exactly one of them would be used based on the order of {{include}}s. If these 
multiple definitions are not identical (which is possible because they are 
generated using different schema files), the "One definition rule" of {{C++}} 
would be broken. Also, IDEs will have difficulty in locating the definition for 
a given type.

I think the other approach proposed, namely generating separate header files 
for each type (AVRO-1370), is a better idea. Here the developer has explicit 
control by deciding to {{include}} specific headers or move the header files 
around to suit their needs. A drawback of this approach is that builds will be 
slower, because several files are included instead of a single one.

Yet another approach is to allow generating header files from multiple schema 
files at once, where de-duping is done by the code generator.


was (Author: thiru_mg):
Even though the mechanism of using individual guards for each generated type 
would work, it is not a common idiom in {{C++}}. In a single project there 
would be multiple definitions of the same type (in multiple header files) and 
exactly one of them would be used based on the order of {{include}}s. If these 
multiple definitions are not identical, (it is possible because they are 
generated using different schema files) the "One definition rule" of {{C++}} 
would be broken. Also, IDEs will have difficulty in locating the definition for 
a given type.

I think the other approach provided namely, generating separate header files 
for each type (AVRO-1370) is a better idea. Here the developer has explicit 
control by deciding to {{include}} specific header or move the header files 
around to suit their needs. A negative of this approach is that build will be 
slow because of including several files instead of a single file previously.

Yet another approach is to allow generating header files from multiple schema 
files at once where de-duping is done by the code generator.

> avrogencpp, includes should have more guards or generate more headers
> -
>
> Key: AVRO-1936
> URL: https://issues.apache.org/jira/browse/AVRO-1936
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.1
>Reporter: Alexander Moriarty
>Priority: Major
>
> Inside an avdl file, one can include other avdl files. But only one header 
> file is generated, and it does not guard the 
> enums/structs which were defined in the other avdl files.
> I have some basic records which I've defined in their own avdl files, and I 
> include them inside more complicated structures.
> All is well, until I try to include multiple of the avro generated header 
> files.
> Inside your AvrogencppTests you have gotten around this by giving each 
> generated type its own namespace.
> As a test, I quickly modified the existing avrogencpp.cc to include an 
> optional name to CodeGen::guard.
> {code:none}
> std::string guard(const string& name="");
> [...]
> string CodeGen::guard(const string& name)
> {
>     string h = name.empty() ? headerFile_ : name;
>     makeCanonical(h, true);
>     return h + "_" + lexical_cast<string>(random_()) + "__H_";
> }
> {code}
> And then adding guards around each Enum, Record, Union, Traits, etc.
> Which works well enough. However... the guards do not include the namespace 
> names, so this change breaks your unit tests.
> As long as two higher level classes in the same namespace do not include the 
> same subclasses the generated header files can both be used, but if you have 
> a basic data type like an Point(x,y) which is used throughout the higher 
> level classes then they will both redefine Point(x,y)
> On the Java side, everything is okay. Point(x,y) and all of the classes which 
> include Point are in their own files inside of a package.
> Is there any common way around this problem?





[jira] [Comment Edited] (AVRO-1936) avrogencpp, includes should have more guards or generate more headers

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730381#comment-16730381
 ] 

Thiruvalluvan M. G. edited comment on AVRO-1936 at 12/28/18 5:06 PM:
-

Even though the mechanism of using individual guards for each generated type 
would work, it is not a common idiom in {{C++}}. In a single project there 
would be multiple definitions of the same type (in multiple header files) and 
exactly one of them would be used based on the order of {{include}}s. If these 
multiple definitions are not identical (which is possible because they are 
generated using different schema files), the "One Definition Rule" of {{C++}} 
would be broken. Also, IDEs will have difficulty in locating the definition for 
a given type.

I think the other approach proposed, namely generating separate header files 
for each type (AVRO-1370), is a better idea. Here the developer has explicit 
control by deciding to {{include}} a specific header or to move the header files 
around to suit their needs. A drawback of this approach is that the build will 
be slower because several files are included instead of a single one.

Yet another approach is to allow generating header files from multiple schema 
files at once where de-duping is done by the code generator.
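
To make the ordering hazard above concrete, here is a minimal single-file 
sketch. The two guarded sections stand in for two generated headers that both 
define a type under the same per-type guard. The macro name AVRO_GEN_POINT_H, 
the two Point layouts and the helper pointFieldCount() are hypothetical; they 
are not what avrogencpp emits.

```cpp
#include <cstddef>

// Stand-in for a.hh, generated from a first schema file.
#ifndef AVRO_GEN_POINT_H
#define AVRO_GEN_POINT_H
struct Point {
    int x;
    int y;
};
#endif

// Stand-in for b.hh, generated from a second schema file that declares a
// different Point. The guard is already defined at this point, so the
// preprocessor silently drops this definition.
#ifndef AVRO_GEN_POINT_H
#define AVRO_GEN_POINT_H
struct Point {
    double x;
    double y;
    double z;
};
#endif

// Reports how many int-sized fields the surviving definition holds.
constexpr std::size_t pointFieldCount() {
    return sizeof(Point) / sizeof(int);
}
```

With the sections in this order, Point is the two-int version. Swapping the 
two sections would select the three-double version instead, which is the 
dependence on include order described in the comment.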


was (Author: thiru_mg):
Even though the mechanism of using individual guards for each generated type 
would work, it is not a common idiom in C++. In a single project there would be 
multiple definitions of the same type (in multiple header files) and exactly 
one of them would be used based on the order of {{include}}s. If these 
multiple definitions are not identical (which is possible because they are 
generated using different schema files), the "One Definition Rule" of C++ would 
be broken. Also, IDEs will have difficulty in locating the definition for a 
given type.

I think the other approach proposed, namely generating separate header files 
for each type (AVRO-1370), is a better idea. Here the developer has explicit 
control by deciding to {{include}} a specific header or to move the header files 
around to suit their needs. A drawback of this approach is that the build will 
be slower because several files are included instead of a single one.

Yet another approach is to allow generating header files from multiple schema 
files at once where de-duping is done by the code generator.

> avrogencpp, includes should have more guards or generate more headers
> -
>
> Key: AVRO-1936
> URL: https://issues.apache.org/jira/browse/AVRO-1936
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.1
>Reporter: Alexander Moriarty
>Priority: Major
>
> Inside of an avdl file, one can include other avdl files. But only one 
> header file is generated, and it does not include-guard the 
> enums/structs which were defined in the other avdl files.
> I have some basic records which I've defined in their own avdl files, and 
> include them inside of more complicated structures.
> All is well, until I try to include several of the Avro-generated header 
> files.
> Inside of your AvrogencppTests you have gotten around this by giving each 
> generated type its own namespace.
> As a test, I quickly modified the existing avrogencpp.cc to add an 
> optional name to CodeGen::guard.
> {code:none}
> std::string guard(const string& name="");
> [...]
> string CodeGen::guard(const string& name)
> {
> string h = name.empty() ? headerFile_ : name;
> makeCanonical(h, true);
> return h + "_" + lexical_cast<string>(random_()) + "__H_";
> }
> {code}
> I then added guards around each Enum, Record, Union, Traits, etc., 
> which works well enough. However, the guards do not include the namespace 
> names, so this change breaks your unit tests.
> As long as two higher-level classes in the same namespace do not include the 
> same subclasses, the generated header files can both be used. But if you have 
> a basic data type like a Point(x,y) which is used throughout the 
> higher-level classes, then they will both redefine Point(x,y).
> On the Java side, everything is okay. Point(x,y) and all of the classes which 
> include Point are in their own files inside of a package.
> Is there any common way around this problem?





[jira] [Comment Edited] (AVRO-1936) avrogencpp, includes should have more guards or generate more headers

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730381#comment-16730381
 ] 

Thiruvalluvan M. G. edited comment on AVRO-1936 at 12/28/18 5:04 PM:
-

Even though the mechanism of using individual guards for each generated type 
would work, it is not a common idiom in C++. In a single project there would be 
multiple definitions of the same type (in multiple header files) and exactly 
one of them would be used based on the order of {{include}}s. If these 
multiple definitions are not identical (which is possible because they are 
generated using different schema files), the "One Definition Rule" of C++ would 
be broken. Also, IDEs will have difficulty in locating the definition for a 
given type.

I think the other approach proposed, namely generating separate header files 
for each type (AVRO-1370), is a better idea. Here the developer has explicit 
control by deciding to {{include}} a specific header or to move the header files 
around to suit their needs. A drawback of this approach is that the build will 
be slower because several files are included instead of a single one.

Yet another approach is to allow generating header files from multiple schema 
files at once where de-duping is done by the code generator.


was (Author: thiru_mg):
Even though the mechanism of using individual guards for each generated type 
would work, it is not a common idiom in C++. In a single project there would be 
multiple definitions of the same type (in multiple header files) and exactly 
one of them would be used based on the order of {{include}}s. If these multiple 
definitions are not identical (which is possible because they are generated using 
different schema files), the "One Definition Rule" of C++ would be broken. Also, 
IDEs will have difficulty in locating the definition for a given type.

I think the other approach proposed, namely generating separate header files 
for each type (AVRO-1370), is a better idea. Here the developer has explicit 
control by deciding to {{include}} a specific header or to move the header files 
around to suit their needs. A drawback of this approach is that the build will 
be slower because several files are included instead of a single one.

Yet another approach is to allow generating header files from multiple schema 
files at once where de-duping is done by the code generator.

> avrogencpp, includes should have more guards or generate more headers
> -
>
> Key: AVRO-1936
> URL: https://issues.apache.org/jira/browse/AVRO-1936
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.1
>Reporter: Alexander Moriarty
>Priority: Major
>
> Inside of an avdl file, one can include other avdl files. But only one 
> header file is generated, and it does not include-guard the 
> enums/structs which were defined in the other avdl files.
> I have some basic records which I've defined in their own avdl files, and 
> include them inside of more complicated structures.
> All is well, until I try to include several of the Avro-generated header 
> files.
> Inside of your AvrogencppTests you have gotten around this by giving each 
> generated type its own namespace.
> As a test, I quickly modified the existing avrogencpp.cc to add an 
> optional name to CodeGen::guard.
> {code:none}
> std::string guard(const string& name="");
> [...]
> string CodeGen::guard(const string& name)
> {
> string h = name.empty() ? headerFile_ : name;
> makeCanonical(h, true);
> return h + "_" + lexical_cast<string>(random_()) + "__H_";
> }
> {code}
> I then added guards around each Enum, Record, Union, Traits, etc., 
> which works well enough. However, the guards do not include the namespace 
> names, so this change breaks your unit tests.
> As long as two higher-level classes in the same namespace do not include the 
> same subclasses, the generated header files can both be used. But if you have 
> a basic data type like a Point(x,y) which is used throughout the 
> higher-level classes, then they will both redefine Point(x,y).
> On the Java side, everything is okay. Point(x,y) and all of the classes which 
> include Point are in their own files inside of a package.
> Is there any common way around this problem?





[jira] [Resolved] (AVRO-1203) Support for obtaining the number of bytes written

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-1203.
---
   Resolution: Workaround
Fix Version/s: (was: 1.9.0)

A simple workaround has been provided: call flush and then read the number of 
bytes from the output stream.
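
The pattern behind this workaround (write, flush, then ask the stream for its 
position) can be sketched with the C++ standard library alone. This is an 
analogy under stated assumptions, not the Avro C++ API: writeAndCount is a 
hypothetical helper and std::ostringstream stands in for the Avro output 
stream.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Writes the payload, flushes, and returns the total number of bytes the
// stream has received so far (hypothetical helper, standard library only).
std::size_t writeAndCount(std::ostringstream& out, const std::string& payload) {
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    out.flush();  // make sure buffered bytes have reached the stream
    return static_cast<std::size_t>(out.tellp());
}
```

Calling writeAndCount twice on the same stream returns a running total, so the 
size of a single record is the difference between two successive results.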

> Support for obtaining the number of bytes written
> -
>
> Key: AVRO-1203
> URL: https://issues.apache.org/jira/browse/AVRO-1203
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.7.2
>Reporter: Manuel Simoni
>Priority: Major
>
> I've used the Avro C++ library to implement Avro support for Node.js [1].
> In the course of this, I needed to extend the library so that it keeps
> track of the number of bytes written to a stream by an encoder.
> These changes mostly required making various headers public, and only
> small changes to implementation code.
> Here's a detailed list of changes:
> - Make the following headers public:
>   -- BinaryEncoder.hh
>   -- Symbol.hh
>   -- ValidatingCodec.hh
>   -- json/JsonDom.hh
>   -- json/JsonEncoder.hh
>   -- json/JsonIO.hh
> - Extend StreamWriter to keep track of number of bytes written 
> (bytesWritten_, getBytesWritten())
> - Expose BinaryEncoder's and JsonEncoder's StreamWriter via getStreamWriter() 
> method
> The complete changes can be viewed here on Github:
> https://github.com/manuel/avro-cpp/commit/f77c108a04fc9e39397eb2fae86b2710b64e2c8a
> The code is available in the "manuel/avro-cpp" Github Repo on the
> "upstream_submit" branch:
> https://github.com/manuel/avro-cpp/tree/upstream_submit
> We would love to have this functionality added to the Avro C++ library.
> [1] https://github.com/collectivemedia/node-avro





[jira] [Closed] (AVRO-1360) C++ Resolving decoder is not working when reader schema has more fields than writer schema

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1360.
-

Closing the issues that got resolved in earlier releases

> C++ Resolving decoder is not working when reader schema has more fields than 
> writer schema
> --
>
> Key: AVRO-1360
> URL: https://issues.apache.org/jira/browse/AVRO-1360
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.4
>Reporter: Ramana Suvarapu
>Assignee: Thiruvalluvan M. G.
>Priority: Major
> Attachments: AVRO-1360-2.patch, AVRO-1360-3.patch, AVRO-1360-4.patch, 
> AVRO-1360-5.patch, AVRO-1360.patch, AVRO-RD.patch, callstack.txt, model.avsc, 
> testreader, testreader-1, testreader.hh, testwriter, testwriter-1, 
> testwriter.hh
>
>
> When the reader schema has more fields than the writer schema, the C++ 
> implementation of the resolving decoder throws the exception 
> "Don't know how to handle excess fields for reader." without checking 
> whether the fields are optional or have default values.
> Attached are reader and writer schemas. The record in the reader schema has 2 
> more fields than the writer schema. One is a required field with a 
> default value and the other is an optional field (union of null and string). 
> Since one has a default value and the other is optional, the reader and writer 
> schemas are supposed to be compatible. 
>  
> {"name": "defaultField", "type": "string", "default": "DEFAULT", 
> "declared":"true"}, 
> {"name": "optionalField", "type": ["string", "null"],"declared":"true"},
>  
> int main()
> {
>   avro::ValidSchema readerSchema = load("reader.json");
>   avro::ValidSchema writerSchema = load("writer.json");
>   avro::DecoderPtr d = avro::resolvingDecoder(writerSchema, 
> readerSchema, avro::binaryDecoder());
> }
>  
> But when I try to create the resolving decoder, I get "Don't know how to 
> handle excess fields for reader." The Java implementation works, however.
>  
> Can you please let us know if there are any other limitations with the C++ 
> implementation of ResolvingDecoder? We are planning to use it in our project 
> and we want to make sure it works as per the Avro specification.





[jira] [Closed] (AVRO-1189) cpp build requires cmake 2.8.4

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1189.
-

Closing the issues that got resolved in earlier releases

> cpp build requires cmake 2.8.4
> --
>
> Key: AVRO-1189
> URL: https://issues.apache.org/jira/browse/AVRO-1189
> Project: Apache Avro
>  Issue Type: Bug
>  Components: build, c++
>Affects Versions: 1.7.2
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: avro-1189.txt
>
>
> The cmake_minimum_required setting for the C++ build is 2.6, but when I try 
> to compile using cmake 2.8.2 I get an error because add_test doesn't 
> understand the WORKING_DIRECTORY property. It turns out this was added in 
> cmake 2.8.4. The CMakeLists.txt should either be updated to use some kind of 
> workaround, or bump the minimum requirement to 2.8.4





[jira] [Updated] (AVRO-2186) avro::Exception string gives the incorrect message

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2186:
--
Fix Version/s: 1.9.0

> avro::Exception string gives the incorrect message
> --
>
> Key: AVRO-2186
> URL: https://issues.apache.org/jira/browse/AVRO-2186
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.2
> Environment: {code:java}
> Linux bream 2.6.39-200.24.1.el6uek.x86_64 #1 SMP Sat Jun 23 02:39:07 EDT 2012 
> x86_64 x86_64 x86_64 GNU/Linux
> {code}
>  
>Reporter: Anubhav Siddharth
>Assignee: Thiruvalluvan M. G.
>Priority: Critical
> Fix For: 1.9.0
>
>
> When a field specified in the schema is missing and we call avro::encode on 
> the data, it throws an exception, which is as expected. The issue is with the 
> error message: the expected and actual types are swapped.
> Schema:
> {code:java}
> {
>  "type" : "record",
>  "name" : "userInfo",
>  "fields" : [
> { "name" : "id", "type" : "int"},
> { "name" : "fullName", "type" : "string" }
>  ]
> }
> {code}
> The generic datum to be encoded has no "id" provided. There is an empty 
> GenericDatum for that field. Here is the exception that is thrown:
> {code:java}
> avro::Exception caught: Invalid operation. Expected: Null got Int{code}
> The exception message should have been 
> avro::Exception caught: Invalid operation. Expected: *Int got Null*
>  
> Please let us know if you need more information on this.





[jira] [Closed] (AVRO-230) Create a shared schema test directory structure

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-230.


Closing the issues that got resolved in earlier releases

> Create a shared schema test directory structure
> ---
>
> Key: AVRO-230
> URL: https://issues.apache.org/jira/browse/AVRO-230
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: c, c++, java, python
>Reporter: Matt Massie
>Assignee: Matt Massie
>Priority: Major
>
> This is an example of my proposed directory structure:
> * invalid_schemas/
> ** broken.json
> ** wrong.json
> * valid_data/
> ** foo_test/
> *** schema.json
> *** json_data/
> **** valid_json_test_data.json
> **** more_valid_json_test_data.json
> *** binary_data/
> **** valid_binary_test_data.bin
> **** more_test_data.bin
> ** bar_test/
> *** schema.json
> *** json_data/
> **** ...
> * invalid_data/
> ** baz_test/
> *** schema.json
> *** json_data/
> **** ...
> *** binary_data/
> **** ...
> This structure supports positive and negative tests for avro schemas, json 
> data and binary data.
> * The "invalid_schemas" directory holds a number of invalid schemas that 
> should fail to parse.
> * The "valid_data" directory has a number of self-contained tests in separate 
> directories.  Each test directory is required to have a "schema.json" file 
> that is a valid Avro schema.  The "json_data" and "binary_data" directories are 
> optional for each test.
> * The "invalid_data" directory has the same rules as the "valid_data" 
> directory.  The data files should fail during tests (negative testing).





[jira] [Closed] (AVRO-1868) Increase precision digits of JSON encodeNumber(double)

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1868.
-

Closing the issues that got resolved in earlier releases

> Increase precision digits of JSON encodeNumber(double)
> --
>
> Key: AVRO-1868
> URL: https://issues.apache.org/jira/browse/AVRO-1868
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Reporter: Jirapat Jirasirikul
>Assignee: Thiruvalluvan M. G.
>Priority: Minor
>
> The JSON encodeNumber(double) function writes the value to an output string 
> stream. The default precision is 6, which is not sufficient for many 
> applications that encode a double.
> std::setprecision(16) should be used to set the maximum precision to 16 digits. 
> Note that doing this would not unnecessarily append zeros to force 16 digits.
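
The effect described in the report can be reproduced with the standard library 
alone. The sketch below is not the Avro encodeNumber implementation; 
formatDouble is a hypothetical helper that only shows how the stream precision 
changes the text produced for a double.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Formats a double with the given number of significant digits, using the
// stream's default (non-fixed, non-scientific) floating-point notation.
std::string formatDouble(double value, int precision) {
    std::ostringstream os;
    os << std::setprecision(precision) << value;
    return os.str();
}
```

With precision 6, 3.141592653589793 is written as 3.14159, while precision 16 
keeps all sixteen digits. A value such as 0.1 is still written as 0.1 at 
precision 16, since the default notation does not pad with trailing zeros.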





[jira] [Closed] (AVRO-523) records with the same name as a member generate bad c++ code

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-523.


Closing the issues that got resolved in earlier releases

> records with the same name as a member generate bad c++ code
> 
>
> Key: AVRO-523
> URL: https://issues.apache.org/jira/browse/AVRO-523
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: John Plevyak
>Assignee: peter liu
>Priority: Major
>
> records with the same name as a member generate bad c++ code:
> {code}
> {
> "type" : "array",
> "name" : "optionals",
> "items" : [
>{ "name" : "l", "type" : "record", "fields" : [ { "name" : "l", 
> "type": "long"} ] },
>{ "name" : "r", "type" : "record", "fields" : [ { "name" : "r", 
> "type": "long"} ] }
> ]
> }
> {code}
> produces c++ code such that when it is compiled it produces:
> union2.h:42: error: field 'int64_t avrouser::l::l' with same name as class





[jira] [Closed] (AVRO-1958) The generated C++ code doesn't compile if there are several enums with identical item names.

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1958.
-

Closing the issues that got resolved in earlier releases

> The generated C++ code doesn't compile if there are several enums with 
> identical item names.
> 
>
> Key: AVRO-1958
> URL: https://issues.apache.org/jira/browse/AVRO-1958
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: Max Lysenko
>Priority: Major
> Attachments: DataSchema.avsc
>
>






[jira] [Closed] (AVRO-1935) C++ avro not reading snappy compressed file properly

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1935.
-

Closing the issues that got resolved in earlier releases

> C++ avro  not reading snappy compressed file properly
> -
>
> Key: AVRO-1935
> URL: https://issues.apache.org/jira/browse/AVRO-1935
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.0
> Environment: Linux
>Reporter: razi
>Priority: Major
>
> I am trying to parse a snappy compressed avro file and I am getting the 
> following error:
> Unknown Codec in data file: AVRO_CODEC_KEY = snappy
> Is there support for snappy compression in Avro c++ ? 
> Patch ?
> Thanks.





[jira] [Closed] (AVRO-1674) Optional field does not work in avro-cpp

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1674.
-

Closing the issues that got resolved in earlier releases

> Optional field does not work in avro-cpp
> 
>
> Key: AVRO-1674
> URL: https://issues.apache.org/jira/browse/AVRO-1674
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.7
>Reporter: Charlie Quillard
>Priority: Major
>
> I have to implement an optional field in my Avro schema. When I test it 
> with avro-py it works, but it does not work with avro-cpp.
> This is my schema (cpx.json):
> {code}
> {
>   "type" : "record",
>   "name" : "example",
>   "fields" : [
> {
>   "name": "city",
>   "type": ["null", "string"],
>   "defaults":null
> }
>   ]
> }
> {code}
> That is my cpp code:
> {code}
> typedef std::pair<avro::ValidSchema, avro::GenericDatum> Pair;
> int main(int ac, char **av)
> {
> std::ifstream ifs("cpx.json");
> avro::ValidSchema schema;
> avro::compileJsonSchema(ifs, schema);
> Pair p(schema, avro::GenericDatum());
> avro::GenericDatum &Data = p.second;
> Data = avro::GenericDatum(schema);
> avro::GenericRecord &sReord = Data.value<avro::GenericRecord>();
> sReord.setFieldAt(sReord.fieldIndex("city"), avro::GenericDatum("test"));
> avro::DataFileWriter<Pair> dataFileWriter("test.bin", schema);
> dataFileWriter.write(p);
> dataFileWriter.close();
> }
> {code}
> This is my error when I convert my binary to JSON with avro-tools:
> {quote}
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2
>   at 
> org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:402)
>   at 
> org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
>   at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>   at 
> org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
>   at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:155)
>   at 
> org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
>   at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
>   at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
>   at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>   at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>   at org.apache.avro.file.DataFileStream.next(DataFileStream.java:220)
>   at org.apache.avro.tool.DataFileReadTool.run(DataFileReadTool.java:77)
>   at org.apache.avro.tool.Main.run(Main.java:84)
>   at org.apache.avro.tool.Main.main(Main.java:73)
> {quote}
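
For reference, the Avro binary encoding writes a union value as the zero-based 
branch index (a zig-zag varint long) followed by the value itself, and a string 
as its byte length (also a long) followed by the UTF-8 bytes. The sketch below 
shows the bytes a reader would expect for the string branch of 
["null", "string"]. It uses only the standard library; appendLong and 
encodeOptionalString are hypothetical helpers, and the report does not say 
which bytes the C++ writer actually produced.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Appends an Avro long: zig-zag encode, then base-128 varint, low group first.
void appendLong(std::vector<std::uint8_t>& out, std::int64_t value) {
    const std::uint64_t sign = value < 0 ? ~std::uint64_t{0} : std::uint64_t{0};
    std::uint64_t zz = (static_cast<std::uint64_t>(value) << 1) ^ sign;
    while (zz >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((zz & 0x7F) | 0x80));
        zz >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(zz));
}

// Encodes a ["null", "string"] union value that holds the string branch.
std::vector<std::uint8_t> encodeOptionalString(const std::string& s) {
    std::vector<std::uint8_t> out;
    appendLong(out, 1);  // index of "string" in ["null", "string"]
    appendLong(out, static_cast<std::int64_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
    return out;
}
```

For the value "test" this yields the six bytes 02 08 74 65 73 74: branch index 
1, length 4, then the four characters.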





[jira] [Closed] (AVRO-522) the rule on unions containing identical types are not enforced

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-522.


Closing the issues that got resolved in earlier releases

> the rule on unions containing identical types are not enforced
> --
>
> Key: AVRO-522
> URL: https://issues.apache.org/jira/browse/AVRO-522
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: John Plevyak
>Assignee: peter liu
>Priority: Major
>
> {
> "type" : "array",
> "name" : "optionals",
> "items" : [ "long", "long" ]
> }
> is accepted despite being illegal by the combination of precompile and 
> gen-cppcode.py
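
The rule in question comes from the Avro specification: a union may not contain 
more than one schema of the same type, except for the named types record, fixed 
and enum. A minimal sketch of such a check follows. It is a simplification 
under stated assumptions: isValidUnion is a hypothetical helper, it looks only 
at type names, and it does not compare the full names of named types as a real 
validator would.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns false if an unnamed type appears more than once among the union
// branches. Named types (record, enum, fixed) are allowed to repeat here;
// a complete check would also require their full names to differ.
bool isValidUnion(const std::vector<std::string>& branchTypes) {
    std::set<std::string> seen;
    for (const std::string& type : branchTypes) {
        const bool named = (type == "record" || type == "enum" || type == "fixed");
        if (!named && !seen.insert(type).second) {
            return false;  // duplicate unnamed type
        }
    }
    return true;
}
```

Applied to the "items" union from the report, ["long", "long"], the check 
returns false, whereas ["null", "string"] passes.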





[jira] [Closed] (AVRO-1181) compileJsonSchemaFromString(std::string) declared in Compiler.hh but not defined

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1181.
-

Closing issues that got resolved in earlier releases.

> compileJsonSchemaFromString(std::string) declared in Compiler.hh but not 
> defined
> 
>
> Key: AVRO-1181
> URL: https://issues.apache.org/jira/browse/AVRO-1181
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.2, 1.7.3, 1.7.4
> Environment: linux, mac os
>Reporter: Daniel Russel
>Assignee: Daniel Russel
>Priority: Major
> Attachments: AVRO-1181.patch
>
>
> compileJsonSchemaFromString(std::string) is declared in the header, but not 
> defined. Need to add
> +AVRO_DECL ValidSchema compileJsonSchemaFromString(const std::string& input)
> +{
> +  return compileJsonSchemaFromString(input.c_str());
> +}
> +
> to the Compiler.cc or remove the prototype.





[jira] [Closed] (AVRO-1290) Handling NaN and positive and negative infinities in C++ Json

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1290.
-

Closing issues that got resolved in earlier releases.

> Handling NaN and positive and negative infinities in C++ Json
> -
>
> Key: AVRO-1290
> URL: https://issues.apache.org/jira/browse/AVRO-1290
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.3
>Reporter: Daniel Russel
>Assignee: Daniel Russel
>Priority: Major
> Attachments: AVRO-1290.patch, patch
>
>
> If you use the JSON encoder and pass it a double with value e.g. 
> std::numeric_limits<double>::infinity() it happily writes the literal "inf". 
> However, the decoder chokes on that literal.
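
One way to avoid emitting a bare token that is not valid JSON is to special-case 
non-finite values before streaming the number. The sketch below is an 
illustration only: encodeJsonDouble is a hypothetical helper, and the quoted 
spellings "NaN", "Infinity" and "-Infinity" are an assumption, not necessarily 
what the attached patch does.

```cpp
#include <cmath>
#include <sstream>
#include <string>

// Returns JSON text for a double. Non-finite values become quoted strings
// (assumed spellings) so that the output stays parseable as JSON.
std::string encodeJsonDouble(double value) {
    if (std::isnan(value)) {
        return "\"NaN\"";
    }
    if (std::isinf(value)) {
        return value > 0 ? "\"Infinity\"" : "\"-Infinity\"";
    }
    std::ostringstream os;
    os << value;
    return os.str();
}
```

A decoder then has to accept the same three strings wherever a double is 
expected, otherwise the round trip fails in the same way as in the report.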





[jira] [Closed] (AVRO-1995) JSON Parser does not properly check current state

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1995.
-

Closing issues that got resolved in earlier releases.

> JSON Parser does not properly check current state
> -
>
> Key: AVRO-1995
> URL: https://issues.apache.org/jira/browse/AVRO-1995
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: Victor Mota
>Assignee: Victor Mota
>Priority: Minor
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> In lines 
> https://github.com/apache/avro/blob/master/lang/c%2B%2B/impl/json/JsonIO.cc#L79
>  and 
> https://github.com/apache/avro/blob/master/lang/c%2B%2B/impl/json/JsonIO.cc#L87,
>  the expression will always evaluate to true since stArrayN and stObjectN are 
> non-zero.
> Presumably, the author meant to do:
> if (curState == stArray0 || curState == stArrayN) {
> and 
> if (curState == stObject0 || curState == stObjectN) {
> respectively.
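
The kind of expression the report describes can be shown in isolation. In the 
sketch below the State enumerators and their order are assumptions made for 
illustration, as are the two helper functions; only the names stArray0, 
stArrayN, stObject0 and stObjectN come from the report.

```cpp
// Hypothetical parser states; only the array/object names appear in the report.
enum State { stValue, stArray0, stArrayN, stObject0, stObjectN, stKey };

// Buggy form: the right-hand operand is a non-zero enumerator on its own,
// so the whole expression is true for every state.
bool inArrayBuggy(State curState) {
    return curState == stArray0 || stArrayN;
}

// Intended form: both operands compare against the current state.
bool inArrayFixed(State curState) {
    return curState == stArray0 || curState == stArrayN;
}
```

For a state such as stKey the buggy form returns true while the fixed form 
returns false. Some compilers warn about an enum constant used in a boolean 
context, which is a useful hint for spotting this pattern.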





[jira] [Closed] (AVRO-1415) C++ binary encoder and decoder doesn't handle uninitialzed enums

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1415.
-

Closing issues that got resolved in earlier releases.

> C++ binary encoder and decoder doesn't handle uninitialzed enums
> 
>
> Key: AVRO-1415
> URL: https://issues.apache.org/jira/browse/AVRO-1415
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.4
>Reporter: Ramana Suvarapu
>Assignee: Ramana Suvarapu
>Priority: Major
> Attachments: AVRO-1415-2.patch, AVRO-1415.patch
>
>
> When enums are not properly initialized and they get encoded / decoded, 
> the C++ enum encoding and decoding traits don't check for uninitialized enums 
> and encode the wrong values. When Java or C# tries to decode them, they throw 
> out-of-bounds exceptions.
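
A range check before encoding is one way to catch such values early. The sketch 
below is not the attached patch: Suit, kSuitCount and checkedEnumIndex are 
hypothetical names used to show the idea with the standard library only.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical generated enum with four symbols.
enum Suit { SPADES, HEARTS, DIAMONDS, CLUBS };
const std::size_t kSuitCount = 4;

// Returns the symbol index to encode, or throws if the raw value does not
// correspond to any symbol (for example an uninitialized enum).
std::size_t checkedEnumIndex(int rawValue, std::size_t symbolCount) {
    if (rawValue < 0 || static_cast<std::size_t>(rawValue) >= symbolCount) {
        throw std::out_of_range("enum value is not a valid symbol index");
    }
    return static_cast<std::size_t>(rawValue);
}
```

Failing on the writer side in this way keeps the invalid index out of the 
encoded data, so readers in other languages never see it.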





[jira] [Closed] (AVRO-1701) warning: comparison between 'const enum testgen_r::ExampleEnum' and 'const enum testgen::ExampleEnum'

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1701.
-

Closing issues that got resolved in earlier releases.

> warning: comparison between 'const enum testgen_r::ExampleEnum' and 'const 
> enum testgen::ExampleEnum'
> -
>
> Key: AVRO-1701
> URL: https://issues.apache.org/jira/browse/AVRO-1701
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.7
>Reporter: peter liu
>Assignee: peter liu
>Priority: Minor
> Attachments: AVRO-1701-A.patch, AVRO-1701.1.patch
>
>
> Saw the warning below while compiling with g++ 4.2 and g++ 4.8 on Mac OS and Linux:
> {quote}
> [ 85%] Building CXX object 
> CMakeFiles/AvrogencppTests.dir/test/AvrogencppTests.cc.o
> In file included from 
> /Users/liuyanbo/Downloads/boost_1_56_0/boost/test/impl/framework.ipp:29:0,
>  from 
> /Users/liuyanbo/Downloads/boost_1_56_0/boost/test/included/unit_test.hpp:20,
>  from 
> /Users/liuyanbo/Downloads/boost_1_56_0/boost/test/included/unit_test_framework.hpp:2,
>  from 
> /Users/liuyanbo/git/avro/lang/c++/test/AvrogencppTests.cc:33:
> /Users/liuyanbo/Downloads/boost_1_56_0/boost/test/test_tools.hpp: In 
> instantiation of 'boost::test_tools::predicate_result 
> boost::test_tools::tt_detail::equal_impl(const Left&, const Right&) [with 
> Left = testgen_r::ExampleEnum; Right = testgen::ExampleEnum]':
> /Users/liuyanbo/Downloads/boost_1_56_0/boost/test/test_tools.hpp:560:40:   
> required from 'boost::test_tools::predicate_result 
> boost::test_tools::tt_detail::equal_impl_frwd::call_impl(const Left&, const 
> Right&, mpl_::false_) const [with Left = testgen_r::ExampleEnum; Right = 
> testgen::ExampleEnum; mpl_::false_ = mpl_::bool_<false>]'
> /Users/liuyanbo/Downloads/boost_1_56_0/boost/test/test_tools.hpp:575:56:   
> required from 'boost::test_tools::predicate_result 
> boost::test_tools::tt_detail::equal_impl_frwd::operator()(const Left&, const 
> Right&) const [with Left = testgen_r::ExampleEnum; Right = 
> testgen::ExampleEnum]'
> /Users/liuyanbo/Downloads/boost_1_56_0/boost/test/test_tools.hpp:523:1:   
> required from 'bool boost::test_tools::tt_detail::check_frwd(Pred, const 
> boost::unit_test::lazy_ostream&, boost::test_tools::const_string, 
> std::size_t, boost::test_tools::tt_detail::tool_level, 
> boost::test_tools::tt_detail::check_type, const Arg0&, const char*, const 
> Arg1&, const char*) [with Pred = 
> boost::test_tools::tt_detail::equal_impl_frwd; Arg0 = testgen_r::ExampleEnum; 
> Arg1 = testgen::ExampleEnum; boost::test_tools::const_string = 
> boost::unit_test::basic_cstring<const char>; std::size_t = long unsigned int]'
> /Users/liuyanbo/git/avro/lang/c++/test/AvrogencppTests.cc:124:5:   required 
> from 'void checkRecord(const T1&, const T2&) [with T1 = 
> testgen_r::RootRecord; T2 = testgen::RootRecord]'
> /Users/liuyanbo/git/avro/lang/c++/test/AvrogencppTests.cc:180:23:   required 
> from here
> /Users/liuyanbo/Downloads/boost_1_56_0/boost/test/test_tools.hpp:536:17: 
> warning: comparison between 'const enum testgen_r::ExampleEnum' and 'const 
> enum testgen::ExampleEnum' [-Wenum-compare]
>  return left == right;
>  ^
> {quote}
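
The warning appears because two distinct enum types are compared directly. One possible way to avoid it, sketched below, is to compare the underlying integer values instead. The namespaces `gen_a` and `gen_b` are hypothetical stand-ins for the two generated namespaces, and this is not necessarily what the attached patches do.

```cpp
#include <cassert>

namespace gen_a { enum ExampleEnum { a, b, c }; }
namespace gen_b { enum ExampleEnum { a, b, c }; }

// Compares two values of unrelated enum types by their integer values,
// avoiding the direct enum-to-enum comparison that triggers -Wenum-compare.
template <typename E1, typename E2>
bool sameEnumValue(E1 lhs, E2 rhs) {
    return static_cast<int>(lhs) == static_cast<int>(rhs);
}
```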





[jira] [Closed] (AVRO-1346) C++: schema parser cannot parse verbose primitive types

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1346.
-

Closing issues that got resolved in earlier releases.

> C++: schema parser cannot parse verbose primitive types
> ---
>
> Key: AVRO-1346
> URL: https://issues.apache.org/jira/browse/AVRO-1346
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.4
>Reporter: Skye Wanderman-Milne
>Assignee: Skye Wanderman-Milne
>Priority: Major
> Attachments: AVRO-1346.2.patch, AVRO-1346.3.patch, AVRO-1346.patch
>
>
> The Avro C++ library's schema parser currently throws an "Unknown additional 
> Json fields" exception if a primitive type is not represented as a string 
> literal. As per 
> http://avro.apache.org/docs/current/spec.html#schema_primitive, primitive 
> types can be defined as e.g. {"type": "string"} or "string". Extra attributes 
> are allowed too, e.g. {"avro.java.string":"String","type":"string"} (from 
> the spec: "Attributes not defined in this document are permitted as metadata, 
> but must not affect the format of serialized data.").





[jira] [Closed] (AVRO-1898) C++ library cannot parse unions with default values

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1898.
-

Closing issues that got resolved in earlier releases.

> C++ library cannot parse unions with default values
> ---
>
> Key: AVRO-1898
> URL: https://issues.apache.org/jira/browse/AVRO-1898
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.1
>Reporter: Hua Zhang
>Assignee: Hua Zhang
>Priority: Major
>
> I have an Avro file generated by the Java library, and when I tried to read it 
> using the C++ library, I got:
> Unexpected type for default value: Expected 6, but found 1
> This basically says the default value should be of type Object, but a value of 
> type Boolean was found. After some investigation, I found that the exception 
> is thrown when parsing the default value of a union. It looks like the C++ 
> library requires the default value of a union to be an object. This differs 
> from the Avro specification, which says that the default value of a union 
> should be of the first type in the union.
> This field is causing the problem:
> {
> "name" : "fake_name",
> "type" : [ "boolean", "null" ],
> "default" : false
> }
> From the code the C++ library is expecting:
> {
> "name" : "is_track_mau30",
> "type" : [ "boolean", "null" ],
> "default" : {"boolean": false}
> }
> The related code is in impl/Compiler.cc, line 284. If I change it to the 
> following, the Avro file can be parsed:
> {code}
> GenericUnion result(n);
> result.selectBranch(0);
> result.datum() = makeGenericDatum(n->leafAt(0), e, st);
> return GenericDatum(n, result);
> {code}
> Why does the C++ library behave differently from the specification? Can/should 
> it be changed to follow the specification? It would be a breaking change, 
> though.
> Thanks, and I'll be happy to contribute this change if it's deemed appropriate.





[jira] [Closed] (AVRO-1172) Avro C++ Json Decoder: Double cannot be decoded

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1172.
-

Closing issues that got resolved in earlier releases.

> Avro C++ Json Decoder: Double cannot be decoded
> ---
>
> Key: AVRO-1172
> URL: https://issues.apache.org/jira/browse/AVRO-1172
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.1
> Environment: Built under msys and gcc-4.6.1 on a Windows7/64 bit 
> machine.
>Reporter: Sam Overend
>Assignee: Sam Overend
>Priority: Major
>  Labels: patch
> Attachments: AVRO-1172-2.patch, AVRO-1172.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Short version: Looks like the C++ version of AVRO-1099.
> Long version: When a double with no fractional part is read from a JSON file, 
> the parser treats it as a long, not a double, and therefore throws an exception. 
> Two possible solutions: (1) The decoder should be able to convert longs to 
> doubles, or acknowledge that a long is a kind of double. 
> (2) The encoder should always output a double with a decimal point.
> Example code is included below. Output is:
> (1.01, 2.13)
> terminate called after throwing an instance of 'avro::Exception'
>   what():  Incorrect token in the stream. Expected: Double, found Integer
> After running complex.json is: {"re":1,"im":2.13
> #include <fstream>
> #include <iostream>
> #include <memory>
> using namespace std;
> #include "cpx.hh"
> #include "avro/Compiler.hh"
> #include "avro/Encoder.hh"
> #include "avro/Decoder.hh"
> avro::ValidSchema load(const char* filename)
> {
> std::ifstream ifs(filename);
> avro::ValidSchema result;
> avro::compileJsonSchema(ifs, result);
> return result;
> }
> void OutTest()
> {
> avro::ValidSchema cpxSchema = load("cpx_schema.json");
> std::auto_ptr<avro::OutputStream> out = 
> avro::fileOutputStream("complex.json",1);
> avro::EncoderPtr e = avro::jsonEncoder(cpxSchema);
> e->init(*out);
> c::cpx c1;
> c1.re = 1.01;
> c1.im = 2.13;
> avro::encode(*e, c1);
> out->flush();
> }
> void OutTest2()
> {
> avro::ValidSchema cpxSchema = load("cpx_schema.json");
> std::auto_ptr<avro::OutputStream> out = 
> avro::fileOutputStream("complex.json",1);
> avro::EncoderPtr e = avro::jsonEncoder(cpxSchema);
> e->init(*out);
> c::cpx c1;
> c1.re = 1.0;
> c1.im = 2.13;
> avro::encode(*e, c1);
> out->flush();
> }
> void InTest()
> {
> avro::ValidSchema cpxSchema = load("cpx_schema.json");
> std::auto_ptr<avro::InputStream> in = 
> avro::fileInputStream("complex.json",1);
> avro::DecoderPtr d = avro::jsonDecoder(cpxSchema);
> d->init(*in);
> c::cpx c2;
> avro::decode(*d, c2);
> std::cout << '(' << c2.re << ", " << c2.im << ')' << std::endl;
> }
> int main()
> {
> OutTest();
> InTest();
> OutTest2();
> InTest();
> return 0;
> }
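
Solution (2) from the report, always writing a double with a decimal point, can be sketched without any Avro types. The helper name `formatJsonDouble` is hypothetical and the sketch assumes finite values and the default "C" locale. It is an illustration of the idea, not the change made in the attached patches.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats a finite double so that the text always reads as a JSON
// floating-point number: "1" becomes "1.0", while "2.5" or "1e+20" are kept.
inline std::string formatJsonDouble(double value) {
    std::ostringstream os;
    os.precision(17);  // enough digits to round-trip an IEEE 754 double
    os << value;
    std::string text = os.str();
    // Append ".0" only when the text is a plain integer (digits and sign).
    if (text.find_first_not_of("-0123456789") == std::string::npos) {
        text += ".0";
    }
    return text;
}
```

With this kind of output, a decoder that distinguishes integer and floating-point tokens would see a floating-point token for every double, including 1.0.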





[jira] [Closed] (AVRO-1474) C++ resolving decoder doesn't work when reader schema has more fields than writer schema

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1474.
-

Closing issues that got resolved in earlier releases.

> C++ resolving decoder doesn't work when reader schema has more fields than 
> writer schema
> 
>
> Key: AVRO-1474
> URL: https://issues.apache.org/jira/browse/AVRO-1474
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.6
>Reporter: Ramana Suvarapu
>Assignee: Thiruvalluvan M. G.
>Priority: Major
> Attachments: AVRO-1474-ENUM.patch, AVRO-1474-MAP.patch, 
> AVRO-1474-REUSE-RD.patch, AVRO-1474-ResolvingDecoder.patch, AVRO-1474.patch, 
> bigrecord, bigrecord_r, reader, writer
>
>
> When the reader schema has more fields than the writer schema, the C++ 
> implementation of the resolving decoder throws the exception 
> "Don't know how to handle excess fields for reader." without checking 
> whether the fields are optional or have default values.
> Attached are the reader and writer schemas. The record in the reader schema 
> has two more fields than the writer schema. One is a required field with a 
> default value and the other is an optional field (a union of null and string). 
> Since one has a default value and the other is optional, the reader and writer 
> schemas are supposed to be compatible. 
>  
> {"name": "defaultField", "type": "string", "default": "DEFAULT", 
> "declared":"true"}, 
> {"name": "optionalField", "type": ["string", "null"],"declared":"true"},
>  
> int main()
> {
>   avro::ValidSchema readerSchema = load("reader.json");
>   avro::ValidSchema writerSchema = load("writer.json");
>   avro::DecoderPtr d = avro::resolvingDecoder(writerSchema, 
> readerSchema,avro::binaryDecoder());
> }
>  
> But when I try to create the resolving decoder, I get "Don't know how to 
> handle excess fields for reader." The Java implementation works.  
> Also, field ordering is not working. 
> The same issue is reported in AVRO-1360.





[jira] [Closed] (AVRO-1994) C++ Code Generator Generates Invalid Code if Field is of type Null

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1994.
-

Closing issues that got resolved in earlier releases.

> C++ Code Generator Generates Invalid Code if Field is of type Null
> --
>
> Key: AVRO-1994
> URL: https://issues.apache.org/jira/browse/AVRO-1994
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: Darryl Green
>Assignee: Darryl Green
>Priority: Major
> Attachments: AVRO-1994.patch, another-AVRO-1994.patch
>
>
> A simple schema like this:
> {
> "name": "TestPrimitiveTypes",
> "type": "record",
> "fields": [
> { "name": "Null", "type": "null" },
> { "name": "Boolean", "type": "boolean" },
> { "name": "Int", "type": "int" },
> { "name": "Long", "type": "long" },
> { "name": "Float", "type": "float" },
> { "name": "Double", "type": "double" },
> { "name": "Bytes", "type": "bytes" },
> { "name": "String", "type": "string" }
> ]
> }
> generates this C++ struct:
> struct TestPrimitiveTypes {
> $Undefined$ Null; // <-- BUG!
> bool Boolean;
> int32_t Int;
> int64_t Long;
> float Float;
> double Double;
> std::vector<uint8_t> Bytes;
> std::string String;
> TestPrimitiveTypes() :
> Null($Undefined$()),
> Boolean(bool()),
> Int(int32_t()),
> Long(int64_t()),
> Float(float()),
> Double(double()),
> Bytes(std::vector<uint8_t>()),
> String(std::string())
> { }
> };
> Note the C++ type of the field Null is $Undefined$ which is obviously 
> invalid/won't compile. 
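
A generator needs some C++ type for every Avro primitive, including null. The sketch below shows one possible mapping. The function name `cppTypeForPrimitive` is hypothetical, and the use of `avro::null` as the placeholder type for a null field is an assumption made for illustration, not a statement of what the attached patches emit.

```cpp
#include <cassert>
#include <string>

// Maps an Avro primitive type name to the C++ type a generator could emit.
// Returns an empty string for names that are not primitive types.
inline std::string cppTypeForPrimitive(const std::string& avroType) {
    if (avroType == "null") return "avro::null";  // assumed placeholder type
    if (avroType == "boolean") return "bool";
    if (avroType == "int") return "int32_t";
    if (avroType == "long") return "int64_t";
    if (avroType == "float") return "float";
    if (avroType == "double") return "double";
    if (avroType == "bytes") return "std::vector<uint8_t>";
    if (avroType == "string") return "std::string";
    return "";
}
```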





[jira] [Closed] (AVRO-1930) JsonParser doesn't handle integer scientific notation

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. closed AVRO-1930.
-

Closing issues that got resolved in earlier releases.

> JsonParser doesn't handle integer scientific notation
> -
>
> Key: AVRO-1930
> URL: https://issues.apache.org/jira/browse/AVRO-1930
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.1
>Reporter: Pietro Cerutti
>Assignee: Pietro Cerutti
>Priority: Major
>  Labels: patch
> Attachments: AVRO-1930.patch
>
>
> The JsonParser doesn't handle, e.g., 1e12, which is valid according to the 
> JSON specification.
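
For reference, the JSON number grammar allows an exponent after an integer part with no fraction, which is why 1e12 is valid. The standalone checker below is a sketch of that grammar only; `isJsonNumber` is a hypothetical helper and is not taken from the Avro JsonParser or the attached patch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if `s` matches the JSON number grammar:
//   -? (0 | [1-9][0-9]*) ('.' [0-9]+)? ([eE] [+-]? [0-9]+)?
inline bool isJsonNumber(const std::string& s) {
    std::size_t i = 0;
    const std::size_t n = s.size();
    // Consumes a run of digits and reports how many were read.
    auto digits = [&]() {
        std::size_t start = i;
        while (i < n && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
        return i - start;
    };
    if (i < n && s[i] == '-') ++i;
    if (i < n && s[i] == '0') {
        ++i;  // a leading zero must stand alone
    } else if (digits() == 0) {
        return false;
    }
    if (i < n && s[i] == '.') {
        ++i;
        if (digits() == 0) return false;
    }
    if (i < n && (s[i] == 'e' || s[i] == 'E')) {
        ++i;
        if (i < n && (s[i] == '+' || s[i] == '-')) ++i;
        if (digits() == 0) return false;
    }
    return i == n;
}
```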





[jira] [Resolved] (AVRO-1488) FileBufferCopyIn::seek does not work on Windows systems

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-1488.
---
Resolution: Fixed

Applied the patch. Thank you [~benmwebb].

> FileBufferCopyIn::seek does not work on Windows systems
> ---
>
> Key: AVRO-1488
> URL: https://issues.apache.org/jira/browse/AVRO-1488
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.6
> Environment: Win32
>Reporter: Ben Webb
>Assignee: Ben Webb
>Priority: Major
>  Labels: easyfix, patch
> Attachments: win32-seek.patch
>
>
> FileBufferCopyIn::seek() on a Windows system generally fails with an 
> exception. This is because it is implemented as
> if (::SetFilePointer(...)  != INVALID_SET_FILE_POINTER) throw(...)
> This test is the opposite of what it should be! SetFilePointer returns the 
> new pointer on success (sort of), just like lseek. See 
> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365541%28v=vs.85%29.aspx
> The fix is pretty simple - reverse the test. But apparently 
> INVALID_SET_FILE_POINTER is also a valid file pointer (that's not confusing 
> at all - thanks Microsoft!) so you also need to check the Windows error 
> indicator to make absolutely sure. I'll attach a patch.
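
The corrected test described above can be expressed as a small predicate. This is a portable sketch rather than the attached patch: `kInvalidSetFilePointer` and `kNoError` are stand-ins carrying the values of the Windows constants INVALID_SET_FILE_POINTER and NO_ERROR, and `seekFailed` is a hypothetical helper that takes the SetFilePointer return value and the GetLastError() value read right after the call.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the Windows constants so the sketch compiles anywhere.
const std::uint32_t kInvalidSetFilePointer = 0xFFFFFFFFu;  // INVALID_SET_FILE_POINTER
const std::uint32_t kNoError = 0;                          // NO_ERROR

// Decides whether a SetFilePointer call failed, given its return value and
// the error code reported immediately afterwards.
inline bool seekFailed(std::uint32_t result, std::uint32_t lastError) {
    return result == kInvalidSetFilePointer && lastError != kNoError;
}
```

The second condition covers the case mentioned in the report where the sentinel value is also a legitimate file position.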





[jira] [Updated] (AVRO-2222) Avro C++ documentation is missing code snippets

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2222:
--
Fix Version/s: 1.9.0

> Avro C++ documentation is missing code snippets
> ---
>
> Key: AVRO-2222
> URL: https://issues.apache.org/jira/browse/AVRO-2222
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.0, 1.8.1, 1.8.2, 1.8.3, 1.8.4
>Reporter: Rui Maciel
>Assignee: Darryl Green
>Priority: Major
> Fix For: 1.9.0
>
> Attachments: 
> 0001-AVRO--Avro-C-documentation-is-missing-code-snipp.patch
>
>
> The Apache Avro C++ Documentation page[¹] includes an example on how to use 
> Avro with C++ but unfortunately most code snippets are missing, which renders 
> the document unusable.
>  
> [¹]http://avro.apache.org/docs/1.8.2/api/cpp/html/index.html





[jira] [Updated] (AVRO-2209) Default value type validation over-zealous

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2209:
--
Fix Version/s: 1.9.0

> Default value type validation over-zealous
> --
>
> Key: AVRO-2209
> URL: https://issues.apache.org/jira/browse/AVRO-2209
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.2
>Reporter: Darryl Green
>Assignee: Darryl Green
>Priority: Major
> Fix For: 1.9.0
>
>
> From the Avro specification re default values (and hence JSON encoding in 
> general):
>  
> |field default values|
> ||avro type||json type||example||
> |null|null|null|
> |boolean|boolean|true|
> |int,long|integer|1|
> |float,double|number|1.1|
> |bytes|string|"\u00FF"|
> |string|string|"foo"|
> |record|object|{"a": 1}|
> |enum|string|"FOO"|
> |array|array|[1]|
> |map|object|{"a": 1}|
> |fixed|string|"\u00ff"|
>  
> Note that float and double have a "json type" of number, while int and long 
> have a "json type" of integer. In JSON, an integer is a number that is 
> constrained to be an integer. There is no way to deduce from a JSON value that 
> has no fractional part whether that value is a number or an integer: it is 
> either/both.
> I believe that the following schema is, on that basis, valid:
> "{ \"name\":\"test\", \"type\": \"record\", \"fields\": [
> {\"name\": \"double\",\"type\": \"double\",\"default\" : 2 }
> ]}",
>  We have a substantial body of similar schemas in use but had not attempted 
> to use C++ to resolve them before, and now this is failing.
>  The fix is reasonably straightforward. PR with tests:
> [https://github.com/apache/avro/pull/326]
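
The relaxed rule argued for above can be written as a small predicate. This is a sketch under assumptions: `JsonKind`, `AvroType` and `numericDefaultAccepted` are hypothetical names, only the numeric Avro types are covered, and the code is not taken from the linked pull request.

```cpp
#include <cassert>

// JSON value kinds a parser might report, and the Avro types of interest.
enum class JsonKind { Null, Bool, Integer, Double, String, Array, Object };
enum class AvroType { Int, Long, Float, Double };

// Returns true if a JSON default of kind `json` is acceptable for a field
// of numeric Avro type `avro`. Integers are accepted for float and double
// because a JSON integer is also a JSON number.
inline bool numericDefaultAccepted(AvroType avro, JsonKind json) {
    switch (avro) {
        case AvroType::Int:
        case AvroType::Long:
            return json == JsonKind::Integer;
        case AvroType::Float:
        case AvroType::Double:
            return json == JsonKind::Integer || json == JsonKind::Double;
    }
    return false;
}
```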
>  
>  
>  





[jira] [Updated] (AVRO-1635) C++ schema aware encoders throw tr1::bad_weak_ptr exception for recursive schema

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1635:
--
Fix Version/s: 1.9.0

> C++ schema aware encoders throw tr1::bad_weak_ptr exception for recursive 
> schema
> 
>
> Key: AVRO-1635
> URL: https://issues.apache.org/jira/browse/AVRO-1635
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.7
> Environment: Ubuntu 12.04, gcc 4.6.4
>Reporter: Heye Vöcking
>Assignee: William Matthews
>Priority: Major
>  Labels: avro, c++, encode, recursive, schema
> Fix For: 1.9.0
>
>
> Encoding an object with a recursive schema fails when using a jsonEncoder or 
> a validatingEncoder. Here is an example:
> Output:
> {noformat}
> terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::tr1::bad_weak_ptr> >'
>   what():  tr1::bad_weak_ptr
> {noformat}
> {code:title=container.json|borderStyle=solid}
> {
>   "name": "Container",
>   "doc": "Container to demonstrate the weak_ptr exception.",
>   "type": "record",
>   "fields": [{
> "name": "field",
> "type": {
>   "name": "Object",
>   "type": "record",
>   "fields": [{
> "name": "value",
> "type": [
>   "string",
>   {"type": "map", "values": "Object"}
> ]
>   }]
> }
>   }]
> }
> {code}
> {code:title=example.cc|borderStyle=solid}
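> #include <fstream>
> #include <map>
> #include <memory>
> #include <string>
> #include <avro/Compiler.hh>
> #include <avro/Encoder.hh>
> #include <avro/Stream.hh>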
> #include "container.hh"
> int
> main()
> {
> std::ifstream ifs("container.json");
> avro::ValidSchema schema;
> avro::compileJsonSchema(ifs, schema);
> std::auto_ptr<avro::OutputStream> out = avro::memoryOutputStream();
> // Either one fails; here we use the validatingEncoder
> // avro::EncoderPtr encoder = avro::jsonEncoder(schema);
> avro::EncoderPtr encoder = avro::validatingEncoder(schema, 
> avro::binaryEncoder());
> 
> // An encoder that doesn't know the schema works fine
> // avro::EncoderPtr encoder = avro::binaryEncoder();
> encoder->init(*out);
> Container container;
> std::map<std::string, Object> object;
> // Note that it doesn't fail if we don't insert a value into the map
> object["a"].value.set_string("x");
> container.field.value.set_map(object);
> avro::encode(*encoder, container);
> return 0;
> }
> {code}
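
For readers unfamiliar with the exception type: a bad_weak_ptr is thrown when a shared_ptr is constructed from a weak_ptr whose object no longer exists. The sketch below reproduces that general mechanism with the standard-library equivalents of the TR1 types. The helpers `promotionThrows` and `makeExpired` are hypothetical, and the report does not say where in the library the expired pointer originates, so this is background only.

```cpp
#include <cassert>
#include <memory>

// Returns true if promoting `weak` to a shared_ptr throws std::bad_weak_ptr,
// which happens when the object it referred to has already been destroyed.
inline bool promotionThrows(const std::weak_ptr<int>& weak) {
    try {
        std::shared_ptr<int> strong(weak);  // throws if `weak` has expired
        return false;
    } catch (const std::bad_weak_ptr&) {
        return true;
    }
}

// Creates a weak_ptr whose owner is destroyed before the function returns.
inline std::weak_ptr<int> makeExpired() {
    std::shared_ptr<int> owner = std::make_shared<int>(1);
    return std::weak_ptr<int>(owner);
}
```

Calling `weak.lock()` instead of the converting constructor returns an empty shared_ptr rather than throwing, which is the usual way to test for expiry.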





[jira] [Updated] (AVRO-2292) C++ does not build on Mac

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2292:
--
Fix Version/s: 1.9.0

> C++ does not build on Mac
> -
>
> Key: AVRO-2292
> URL: https://issues.apache.org/jira/browse/AVRO-2292
> Project: Apache Avro
>  Issue Type: Task
>  Components: c++
>Reporter: Thiruvalluvan M. G.
>Assignee: Thiruvalluvan M. G.
>Priority: Major
> Fix For: 1.9.0
>
>
> The current master does not build on Mac.





[jira] [Updated] (AVRO-1335) C++ should support field default values

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1335:
--
Fix Version/s: 1.9.0

> C++ should support field default values
> ---
>
> Key: AVRO-1335
> URL: https://issues.apache.org/jira/browse/AVRO-1335
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.7.4
>Reporter: Bin Guo
>Assignee: Victor Mota
>Priority: Major
> Fix For: 1.9.0
>
> Attachments: AVRO-1335.patch
>
>
> We found that resolvingDecoder could not provide bidirectional compatibility 
> between different versions of a schema.
> This is especially true for records, for example:
> {code:title=First schema}
> {
>     "type": "record",
>     "name": "TestRecord",
>     "fields": [
>         {
>             "name": "MyData",
>             "type": {
>                 "type": "record",
>                 "name": "SubData",
>                 "fields": [
>                     {
>                         "name": "Version1",
>                         "type": "string"
>                     }
>                 ]
>             }
>         },
>         {
>             "name": "OtherData",
>             "type": "string"
>         }
>     ]
> }
> {code}
> {code:title=Second schema}
> {
>     "type": "record",
>     "name": "TestRecord",
>     "fields": [
>         {
>             "name": "MyData",
>             "type": {
>                 "type": "record",
>                 "name": "SubData",
>                 "fields": [
>                     {
>                         "name": "Version1",
>                         "type": "string"
>                     },
>                     {
>                         "name": "Version2",
>                         "type": "string"
>                     }
>                 ]
>             }
>         },
>         {
>             "name": "OtherData",
>             "type": "string"
>         }
>     ]
> }
> {code}
> Say node A knows only the first schema and node B knows the second schema, 
> which has more fields. 
> Any data generated by node B can be resolved with the first schema because the 
> additional field is marked as skipped.
> But data generated by node A cannot be resolved with the second schema; it 
> throws an exception *"Don't know how to handle excess fields for reader."*
> This is because data is resolved exactly according to the auto-generated 
> codec_traits, which try to read the excess field.
> The problem is that we cannot simply ignore the excess field in the record, 
> since the data after the troublesome record also needs to be resolved.
> This problem has had us stuck for a very long time.
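
The Avro specification resolves a reader field that is missing from the writer's record by using the reader's default value, and signals an error only when no default exists. The sketch below shows that check in isolation. `Field` and `unresolvableFields` are hypothetical names, and the structure is far simpler than the library's real schema nodes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified field description: only what the check below needs.
struct Field {
    std::string name;
    bool hasDefault;
};

// Returns the names of reader fields that are absent from the writer's record
// and have no default value, i.e. the fields that cannot be resolved.
inline std::vector<std::string> unresolvableFields(
        const std::vector<Field>& writer, const std::vector<Field>& reader) {
    std::vector<std::string> missing;
    for (const Field& r : reader) {
        bool inWriter = false;
        for (const Field& w : writer) {
            if (w.name == r.name) { inWriter = true; break; }
        }
        if (!inWriter && !r.hasDefault) missing.push_back(r.name);
    }
    return missing;
}
```

Applied to the schemas above, a reader using the second schema could resolve data written with the first one only if "Version2" declared a default.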





[jira] [Updated] (AVRO-2214) Support sync and seek in C++ DataFileReader

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2214:
--
Fix Version/s: 1.9.0

> Support sync and seek in C++ DataFileReader
> ---
>
> Key: AVRO-2214
> URL: https://issues.apache.org/jira/browse/AVRO-2214
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.8.2
>Reporter: William Matthews
>Assignee: William Matthews
>Priority: Minor
> Fix For: 1.9.0
>
>
> Java DataFileReader supports sync, seek, pastSync, etc. which allow parallel 
> reads of files, and reasonably efficient "tailing" of files. It would be 
> great if these were supported in C++ too.
> Also, I think this would serve as a bit of a workaround for 
> https://issues.apache.org/jira/browse/AVRO-2178 (stat a file & see if it has 
> grown, sync/seek, read, repeat).
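
The core of a sync operation is finding the next 16-byte sync marker after an arbitrary offset, so that reading can resume at a block boundary. The sketch below does this on an in-memory buffer. `SyncMarker` and `offsetAfterNextSync` are hypothetical names, and a real reader would scan a stream rather than a vector.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::array<std::uint8_t, 16> SyncMarker;

// Returns the offset just past the first sync marker found at or after
// `from`, or data.size() if no marker occurs in the rest of the buffer.
inline std::size_t offsetAfterNextSync(const std::vector<std::uint8_t>& data,
                                       const SyncMarker& marker,
                                       std::size_t from) {
    if (from >= data.size()) return data.size();
    std::vector<std::uint8_t>::const_iterator it =
        std::search(data.begin() + from, data.end(), marker.begin(), marker.end());
    if (it == data.end()) return data.size();
    return static_cast<std::size_t>(it - data.begin()) + marker.size();
}
```

Two readers given different starting offsets in the same file could each call this to find their first complete block, which is what makes parallel reads possible.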





[jira] [Updated] (AVRO-2220) std::bad_alloc when String or Bytes field has a negative length

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2220:
--
Fix Version/s: 1.9.0

> std::bad_alloc when String or Bytes field has a negative length
> ---
>
> Key: AVRO-2220
> URL: https://issues.apache.org/jira/browse/AVRO-2220
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: Victor Mota
>Assignee: Victor Mota
>Priority: Major
> Fix For: 1.9.0
>
> Attachments: 
> poc-18e554fc65b937059584f21805da4b598f2266290f19d764da2c30ca1c829d0a (3)
>
>
> Attached is a sample file created by our fuzzer running on the C++ library 
> that causes a std::bad_alloc because the string or bytes field has an 
> invalid negative integer length. The fix is trivial; I'll send out a PR soon, 
> but it's something like:
>  
> {code:java}
> void BinaryDecoder::decodeString(std::string& value)
> {
>     // Preserve the sign to avoid allocating memory if len is negative.
>     ssize_t len = decodeInt();
>     if (len < 0) {
>         throw Exception(
>             boost::format("Cannot have a string of negative length: %1%") % len);
>     }
>     value.resize(len);
>     if (len > 0) {
>         in_.readBytes(reinterpret_cast<uint8_t*>(&value[0]), len);
>     }
> }
> {code}
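
A negative length can appear because Avro stores lengths as zig-zag encoded integers, so corrupt input can decode to any signed value. The sketch below shows the zig-zag decoding step and a length check in isolation. `decodeZigzag64` and `checkedLength` are hypothetical helpers, not functions from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Undoes zig-zag encoding: 0, 1, 2, 3, ... map to 0, -1, 1, -2, ...
inline std::int64_t decodeZigzag64(std::uint64_t n) {
    return static_cast<std::int64_t>(n >> 1) ^ -static_cast<std::int64_t>(n & 1);
}

// Converts a decoded length to size_t, rejecting negative values instead of
// letting them wrap around to a huge unsigned allocation size.
inline std::size_t checkedLength(std::int64_t len) {
    if (len < 0) {
        throw std::range_error("negative length in string or bytes field");
    }
    return static_cast<std::size_t>(len);
}
```

Without the check, casting -1 to an unsigned size yields the largest representable value, and the resulting resize request is what surfaces as std::bad_alloc.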





[jira] [Updated] (AVRO-2213) C++ tests fail with boost 1.67+

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2213:
--
Fix Version/s: 1.9.0

> C++ tests fail with boost 1.67+
> ---
>
> Key: AVRO-2213
> URL: https://issues.apache.org/jira/browse/AVRO-2213
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.8.2
>Reporter: William Matthews
>Assignee: William Matthews
>Priority: Minor
> Fix For: 1.9.0
>
>
> Boost 1.67 and later return an error when multiple tests are added with 
> the same name 
> ([https://www.boost.org/doc/libs/1_67_0/libs/test/doc/html/boost_test/change_log.html]).
> This fails for 3 tests:
> CodecTests.cc:
> Test setup error: boost::unit_test::framework::setup_error: test unit with 
> name 'testCodecResolving2_0' registered 
> multiple times
> DataFileTests.cc:
> Test setup error: boost::unit_test::framework::setup_error: test unit with 
> name 'DataFileTest__testReadFull' registered multiple times
> unittest.cc:
> Test setup error: boost::unit_test::framework::setup_error: test unit with 
> name 'T__test' registered multiple times
>  
>  





[jira] [Updated] (AVRO-2014) C++ DataFile support custom stream

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2014:
--
Fix Version/s: 1.9.0

> C++ DataFile support custom stream
> --
>
> Key: AVRO-2014
> URL: https://issues.apache.org/jira/browse/AVRO-2014
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Reporter: Zoyo Pei
>Assignee: Thiruvalluvan M. G.
>Priority: Major
> Fix For: 1.9.0
>
>
> It is recommended that the C++ DataFile classes support custom streams, e.g.
> DataFileWriter(OutputStream *stream, ...);
> Then we could write to HDFS like this:
> auto writer = new DataFileWriter(new HDFSOutputStream(...), ...);





[jira] [Updated] (AVRO-2010) multi dimensional array schema is not generated as expected

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2010:
--
Fix Version/s: 1.9.0

> multi dimensional array schema is not generated as expected
> ---
>
> Key: AVRO-2010
> URL: https://issues.apache.org/jira/browse/AVRO-2010
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
> Environment: Ubuntu 14.04, boost 1.62
>Reporter: Philip Henzler
>Assignee: Thiruvalluvan M. G.
>Priority: Major
> Fix For: 1.9.0
>
>
> I made the following change to the unittest.cc, adding a two dimensional 
> array to the buildSchema() test:
> {code}
> ArraySchema array = ArraySchema(DoubleSchema());
> const std::string s("myarray");
> record.addField(s, array);
> ArraySchema array2 = ArraySchema(ArraySchema(DoubleSchema()));
> const std::string s2("my2dimarray");
> record.addField(s2, array2);
> {code}
> It creates the following JSON schema output:
> {code}
> {
>     "name": "myarray",
>     "type": {
>         "type": "array",
>         "items": "double"
>     }
> },
> {
>     "name": "my2dimarray",
>     "type": {
>         "type": "array",
>         "items": "double"
>     }
> }
> {code}
> However, I would expect the following output for the two dimensional case:
> {code}
> {
>     "name": "my2dimarray",
>     "type": {
>         "type": "array",
>         "items": {
>             "type": "array",
>             "items": "double"
>         }
>     }
> }
> {code}
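
The expected output requires the serializer to recurse into the item schema at every level of nesting. The sketch below shows that recursion on a toy node type. `SchemaNode`, `primitive`, `arrayOf` and `toJson` are hypothetical names that only model primitives and arrays, and they are unrelated to the library's own schema classes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Minimal schema node: either a primitive (items == nullptr) or an array.
struct SchemaNode {
    std::string type;                   // e.g. "double" or "array"
    std::shared_ptr<SchemaNode> items;  // element schema when type == "array"
};

inline std::shared_ptr<SchemaNode> primitive(const std::string& name) {
    return std::make_shared<SchemaNode>(SchemaNode{name, nullptr});
}

inline std::shared_ptr<SchemaNode> arrayOf(std::shared_ptr<SchemaNode> items) {
    return std::make_shared<SchemaNode>(SchemaNode{"array", items});
}

// Serializes the node to compact JSON, recursing into array element schemas
// so that each level of nesting is preserved.
inline std::string toJson(const SchemaNode& node) {
    if (node.type != "array") return "\"" + node.type + "\"";
    return "{\"type\":\"array\",\"items\":" + toJson(*node.items) + "}";
}
```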





[jira] [Updated] (AVRO-1488) FileBufferCopyIn::seek does not work on Windows systems

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1488:
--
Fix Version/s: 1.9.0

> FileBufferCopyIn::seek does not work on Windows systems
> ---
>
> Key: AVRO-1488
> URL: https://issues.apache.org/jira/browse/AVRO-1488
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.6
> Environment: Win32
>Reporter: Ben Webb
>Assignee: Ben Webb
>Priority: Major
>  Labels: easyfix, patch
> Fix For: 1.9.0
>
> Attachments: win32-seek.patch
>
>
> FileBufferCopyIn::seek() on a Windows system generally fails with an 
> exception. This is because it is implemented as
> if (::SetFilePointer(...)  != INVALID_SET_FILE_POINTER) throw(...)
> This test is the opposite of what it should be! SetFilePointer returns the 
> new pointer on success (sort of), just like lseek. See 
> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365541%28v=vs.85%29.aspx
> The fix is pretty simple - reverse the test. But apparently 
> INVALID_SET_FILE_POINTER is also a valid file pointer (that's not confusing 
> at all - thanks Microsoft!) so you also need to check the Windows error 
> indicator to make absolutely sure. I'll attach a patch.





[jira] [Assigned] (AVRO-1488) FileBufferCopyIn::seek does not work on Windows systems

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. reassigned AVRO-1488:
-

Assignee: Ben Webb

> FileBufferCopyIn::seek does not work on Windows systems
> ---
>
> Key: AVRO-1488
> URL: https://issues.apache.org/jira/browse/AVRO-1488
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.6
> Environment: Win32
>Reporter: Ben Webb
>Assignee: Ben Webb
>Priority: Major
>  Labels: easyfix, patch
> Attachments: win32-seek.patch
>
>
> FileBufferCopyIn::seek() on a Windows system generally fails with an 
> exception. This is because it is implemented as
> if (::SetFilePointer(...)  != INVALID_SET_FILE_POINTER) throw(...)
> This test is the opposite of what it should be! SetFilePointer returns the 
> new pointer on success (sort of), just like lseek. See 
> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365541%28v=vs.85%29.aspx
> The fix is pretty simple - reverse the test. But apparently 
> INVALID_SET_FILE_POINTER is also a valid file pointer (that's not confusing 
> at all - thanks Microsoft!) so you also need to check the Windows error 
> indicator to make absolutely sure. I'll attach a patch.





[jira] [Assigned] (AVRO-2037) Use std::any where available

2018-12-28 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. reassigned AVRO-2037:
-

Assignee: Thiruvalluvan M. G.

> Use std::any where available
> 
>
> Key: AVRO-2037
> URL: https://issues.apache.org/jira/browse/AVRO-2037
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Reporter: Darryl Green
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> The use of boost::any to hold union types causes a significant performance
> hit, especially for small types, in particular when using
> [null,primitive] for optional primitive type elements of a schema. Most
> (all?) implementations of std::any include a small value optimisation that
> avoids allocation overhead for scalars and other small types. It's a little
> unfortunate that a C++ binding of a notionally high
> performance serialization format performs so poorly in this case (note - I
> had previously proposed using boost::variant which would address this problem
> but would fail to support recursive types or truly huge numbers of distinct
> types in a union). Obviously this requires C++17 but could fall back to
> boost::any for older compilers.
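One way the proposed fallback could be arranged is sketched below. Everything in the sketch namespace, the SKETCH_HAVE_STD_ANY macro and the OptionalInt type are hypothetical names for illustration, not part of the Avro C++ library; the sketch assumes a compiler that supports __has_include, and that Boost is installed when std::any is not available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pick std::any when the toolchain provides it, otherwise boost::any.
#if defined(__has_include)
#  if __has_include(<any>) && __cplusplus >= 201703L
#    include <any>
#    define SKETCH_HAVE_STD_ANY 1
#  endif
#endif

#ifdef SKETCH_HAVE_STD_ANY
namespace sketch {
using any = std::any;
using std::any_cast;
}
#else
#include <boost/any.hpp>
namespace sketch {
using any = boost::any;
using boost::any_cast;
}
#endif

// Toy model of the Avro union ["null", "int"] (hypothetical type).
struct OptionalInt {
    std::size_t branch = 0;  // 0 = null, 1 = int
    sketch::any value;       // empty while the null branch is selected

    void setNull() { branch = 0; value = sketch::any(); }
    void setInt(std::int32_t v) { branch = 1; value = v; }
    bool isNull() const { return branch == 0; }
    std::int32_t getInt() const { return sketch::any_cast<std::int32_t>(value); }
};
```

Code that only uses sketch::any and sketch::any_cast stays the same on both paths; whether a small int is stored without allocation then depends on the standard library's std::any implementation.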





[jira] [Resolved] (AVRO-2280) Calling DataFileWriter::flush() when there is no data to write can subsequently cause an exception when the file is read

2018-12-24 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-2280.
---
   Resolution: Fixed
 Assignee: Thiruvalluvan M. G.
Fix Version/s: 1.9.0

Merged the Pull Request

> Calling DataFileWriter::flush() when there is no data to write can 
> subsequently cause an exception when the file is read
> 
>
> Key: AVRO-2280
> URL: https://issues.apache.org/jira/browse/AVRO-2280
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: Brian Walshe
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 1.9.0
>
>
> If you call flush() on a DataFileWriter object that has no data waiting to be
> written, this will produce an empty block at the end of the file, which will
> cause an exception on the last call to DataFileReader::read(T& datum).
> h2. Example
> For example, adding the following to the Data File unit tests will cause them
> to break:
> [https://github.com/bwalshe/avro/blob/7c6a229b2fcbb0b88368e1503a58daef9f43ee64/lang/c%2B%2B/test/DataFileTests.cc#L192]
> h2. Possible Solution
> Altering DataFileWriter::sync() to check if there are objects to be written
> before proceeding will get the code to pass the unit tests, e.g.:
> [https://github.com/bwalshe/avro/blob/7c6a229b2fcbb0b88368e1503a58daef9f43ee64/lang/c%2B%2B/impl/DataFile.cc#L141]
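The writer-side guard proposed above can be illustrated with a toy writer. This is a sketch under stated assumptions, not the code behind the linked change: WrittenBlock and SketchWriter are hypothetical stand-ins for Avro's data file structures, and only the "skip sync when nothing is pending" idea is taken from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one data file block.
struct WrittenBlock {
    std::size_t objectCount;
    std::string payload;
};

// Hypothetical stand-in for DataFileWriter, keeping blocks in memory.
class SketchWriter {
public:
    void write(const std::string& datum) {
        pending_.payload += datum;
        ++pending_.objectCount;
    }

    // Emits the pending block, but only if it holds at least one object.
    void sync() {
        if (pending_.objectCount == 0) {
            return;  // nothing buffered, so do not emit an empty block
        }
        blocks_.push_back(pending_);
        pending_ = WrittenBlock{0, ""};
    }

    void flush() { sync(); }

    const std::vector<WrittenBlock>& blocks() const { return blocks_; }

private:
    WrittenBlock pending_{0, ""};
    std::vector<WrittenBlock> blocks_;
};
```

Calling flush() twice in a row on this toy writer produces no trailing empty block, which is the behaviour the unit test in the description expects.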





[jira] [Comment Edited] (AVRO-2280) Calling DataFileWriter::flush() when there is no data to write can subsequently cause an exception when the file is read

2018-12-24 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728566#comment-16728566
 ] 

Thiruvalluvan M. G. edited comment on AVRO-2280 at 12/25/18 2:11 AM:
-

Merged the Pull Request


was (Author: thiru_mg):
Merged the Pull Requet

> Calling DataFileWriter::flush() when there is no data to write can 
> subsequently cause an exception when the file is read
> 
>
> Key: AVRO-2280
> URL: https://issues.apache.org/jira/browse/AVRO-2280
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: Brian Walshe
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 1.9.0
>
>
> If you call flush() on a DataFileWriter object that has no data waiting to be
> written, this will produce an empty block at the end of the file, which will
> cause an exception on the last call to DataFileReader::read(T& datum).
> h2. Example
> For example, adding the following to the Data File unit tests will cause them
> to break:
> [https://github.com/bwalshe/avro/blob/7c6a229b2fcbb0b88368e1503a58daef9f43ee64/lang/c%2B%2B/test/DataFileTests.cc#L192]
> h2. Possible Solution
> Altering DataFileWriter::sync() to check if there are objects to be written
> before proceeding will get the code to pass the unit tests, e.g.:
> [https://github.com/bwalshe/avro/blob/7c6a229b2fcbb0b88368e1503a58daef9f43ee64/lang/c%2B%2B/impl/DataFile.cc#L141]





[jira] [Resolved] (AVRO-434) Implemented Object Container Writer for c++

2018-12-23 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-434.
--
Resolution: Not A Problem

We have since implemented a much more complete Data file reader and writer in 
C++ for all the standard codecs and hence this issue has become redundant.

> Implemented Object Container Writer for c++
> ---
>
> Key: AVRO-434
> URL: https://issues.apache.org/jira/browse/AVRO-434
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.4.0
> Environment: c++
>Reporter: Babak Salamat
>Assignee: Babak Salamat
>Priority: Major
> Attachments: objectfile.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The code changes include classes for a null codec, object container, data 
> blocks and object container writer.
> The changes have not been tested and further testing is required. I am 
> submitting the changes because I may not be able to work on it further.





[jira] [Commented] (AVRO-2280) Calling DataFileWriter::flush() when there is no data to write can subsequently cause an exception when the file is read

2018-12-23 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728001#comment-16728001
 ] 

Thiruvalluvan M. G. commented on AVRO-2280:
---

Thank you [~bwalshe] for identifying the problem and providing an example where 
it breaks.

Even though your suggested solution addresses the example problem, it does not
address a more general issue. The trouble is with data file blocks with zero
objects (call them empty blocks) in them. The [Avro
specification|https://avro.apache.org/docs/1.8.1/spec.html] allows for empty
blocks. Thus, even if the C++ implementation does not generate empty blocks,
other language bindings may generate them. So the C++ reader should be able to
handle them.

I've made a [pull request|https://github.com/apache/avro/pull/414] to
address the reader's problem.
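The reader-side handling can be sketched roughly as follows. This is an illustration of the idea only and makes no claim about what the pull request actually changes: ReadBlock and SketchReader are hypothetical stand-ins, and blocks are modelled as in-memory lists of strings rather than encoded Avro data.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a decoded data file block; an empty block
// is simply one whose object list is empty.
struct ReadBlock {
    std::vector<std::string> objects;
};

// Hypothetical stand-in for DataFileReader.
class SketchReader {
public:
    explicit SketchReader(std::deque<ReadBlock> blocks)
        : blocks_(std::move(blocks)) {}

    // Stores the next object in `datum` and returns true, or returns
    // false once every block has been consumed.
    bool read(std::string& datum) {
        // A loop rather than a single check: any number of consecutive
        // empty blocks, including trailing ones, is skipped.
        while (index_ >= current_.objects.size()) {
            if (blocks_.empty()) {
                return false;
            }
            current_ = std::move(blocks_.front());
            blocks_.pop_front();
            index_ = 0;
        }
        datum = current_.objects[index_++];
        return true;
    }

private:
    std::deque<ReadBlock> blocks_;
    ReadBlock current_{};
    std::size_t index_ = 0;
};
```

Because the reader advances past empty blocks wherever they occur, it does not matter which language binding produced the file.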

> Calling DataFileWriter::flush() when there is no data to write can 
> subsequently cause an exception when the file is read
> 
>
> Key: AVRO-2280
> URL: https://issues.apache.org/jira/browse/AVRO-2280
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: Brian Walshe
>Priority: Major
>  Labels: newbie, pull-request-available
>
> If you call flush() on a DataFileWriter object that has no data waiting to be
> written, this will produce an empty block at the end of the file, which will
> cause an exception on the last call to DataFileReader::read(T& datum).
> h2. Example
> For example, adding the following to the Data File unit tests will cause them
> to break:
> [https://github.com/bwalshe/avro/blob/7c6a229b2fcbb0b88368e1503a58daef9f43ee64/lang/c%2B%2B/test/DataFileTests.cc#L192]
> h2. Possible Solution
> Altering DataFileWriter::sync() to check if there are objects to be written
> before proceeding will get the code to pass the unit tests, e.g.:
> [https://github.com/bwalshe/avro/blob/7c6a229b2fcbb0b88368e1503a58daef9f43ee64/lang/c%2B%2B/impl/DataFile.cc#L141]





[jira] [Commented] (AVRO-1541) Specific.hh is over-specialized for standard C++ containers

2018-12-23 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727994#comment-16727994
 ] 

Thiruvalluvan M. G. commented on AVRO-1541:
---

Good observation.

{{Specific.hh}} fixes the C++ types used for the various Avro types. Apart from
the obvious mappings for {{bool}}, {{int}}, {{long}} etc., it fixes {{std::string}}
for Avro string, {{std::vector<uint8_t>}} for Avro bytes and
{{boost::array<uint8_t, N>}} for Avro fixed. For composite types - Avro array
and map - it uses {{std::vector<T>}} and {{std::map<std::string, T>}}. The user
has no control over the choice of these C++ types when generating code using
the {{avrogencpp}} tool. It would be an interesting project to generalize this to
allow the user to specify a compatible C++ type while generating the code.

Now {{codec_traits}} exists to enable easy compile-time support for the
generated code, and since the C++ types corresponding to Avro types are fixed,
the existing specializations of {{codec_traits}} are sufficient. But as
[~wgs_smiddleditch] points out, these specializations are too narrow and, if
widened, they can be used for other purposes even if we do not allow the user
to choose C++ types during code generation.

The above observation is true for {{Generic.hh}} as well, except that for Avro
record and other types, it does not have specific generated classes but rather
uses {{GenericRecord}} and so on. But given the nature of use of
{{Generic.hh}}, customizability will not be of great use and hence no further
work is needed there.

> Specific.hh is over-specialized for standard C++ containers
> ---
>
> Key: AVRO-1541
> URL: https://issues.apache.org/jira/browse/AVRO-1541
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.6
>Reporter: Sean Middleditch
>Priority: Trivial
>
> The encoders in Specific.hh for the C++ stdlib types like string, vector,
> etc. are over-specialized and accept only specific instantiations of these
> class templates. The specializations of codec_traits should be partial
> specializations to support std::string, std::vector, std::map, and so on
> using user-specified allocators and (for std::set and std::map) a user-specified
> comparator, as Avro has absolutely no reason to care about these details.
> These partial specializations will not require any incompatible API changes.
> They'd look something like:
> template <typename T, typename Alloc>
> struct codec_traits<std::vector<T, Alloc> > {
>   static void encode(Encoder& e, const std::vector<T, Alloc>& s) {
>     // ...
>   }
>   static void decode(Decoder& d, std::vector<T, Alloc>& s) {
>     // ...
>   }
> };
> std::string could be the trickier one, since it should probably be a partial
> specialization over basic_string that covers the supported variants
> std::string, std::u16string, std::u32string, and
> std::wstring (which can all have custom allocators).
> Don't forget that std::set and std::map can have both a custom allocator and
> a custom comparator.
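To make the suggestion concrete, here is a self-contained sketch of such a partial specialization. It deliberately does not use the Avro headers: the Encoder below is a toy that records tokens, PlainAllocator is a hypothetical user allocator, and everything lives in a made-up sketch namespace. Only the shape of the codec_traits specialization mirrors the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

namespace sketch {

// Toy encoder that records tokens (not Avro's binary encoding).
struct Encoder {
    std::vector<std::string> out;
    void encodeLong(std::int64_t v) { out.push_back(std::to_string(v)); }
    void arrayStart() { out.push_back("["); }
    void arrayEnd() { out.push_back("]"); }
};

template <typename T>
struct codec_traits;

template <>
struct codec_traits<std::int64_t> {
    static void encode(Encoder& e, std::int64_t v) { e.encodeLong(v); }
};

// Partial specialization over both the element type and the allocator,
// so std::vector<T, AnyAllocator> is accepted.
template <typename T, typename Alloc>
struct codec_traits<std::vector<T, Alloc> > {
    static void encode(Encoder& e, const std::vector<T, Alloc>& v) {
        e.arrayStart();
        for (const T& item : v) {
            codec_traits<T>::encode(e, item);
        }
        e.arrayEnd();
    }
};

template <typename T>
void encode(Encoder& e, const T& value) {
    codec_traits<T>::encode(e, value);
}

// Minimal user-defined allocator, used only to show that the
// specialization above no longer cares which allocator is in play.
template <typename T>
struct PlainAllocator {
    using value_type = T;
    PlainAllocator() = default;
    template <typename U>
    PlainAllocator(const PlainAllocator<U>&) {}
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

template <typename T, typename U>
bool operator==(const PlainAllocator<T>&, const PlainAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const PlainAllocator<T>&, const PlainAllocator<U>&) { return false; }

}  // namespace sketch
```

A std::vector<std::int64_t, sketch::PlainAllocator<std::int64_t>> passed to sketch::encode resolves to the same specialization as a vector with the default allocator, which is the point of the proposal.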





[jira] [Commented] (AVRO-1203) Support for obtaining the number of bytes written

2018-12-23 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727985#comment-16727985
 ] 

Thiruvalluvan M. G. commented on AVRO-1203:
---

This exposes too much of the implementation and will make it hard to modify the
implementation in the future. As [~pjacobs] suggests, it should be possible to
flush and check the bytes written (`OutputStream::byteCount()` exists exactly
for that purpose).
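The "flush, then ask the stream" pattern can be shown with toy classes. The sketch below does not use the Avro headers: CountingStream and BufferedEncoder are hypothetical stand-ins, and only the byteCount() name is borrowed from the comment above. It illustrates why the count is only meaningful after a flush when the encoder buffers internally.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

namespace sketch {

// Hypothetical output stream that only counts what it receives.
class CountingStream {
public:
    void write(const std::string& bytes) { count_ += bytes.size(); }
    std::uint64_t byteCount() const { return count_; }

private:
    std::uint64_t count_ = 0;
};

// Hypothetical encoder that buffers until flush() is called.
class BufferedEncoder {
public:
    explicit BufferedEncoder(CountingStream& out) : out_(out) {}

    void encodeBytes(const std::string& bytes) { buffer_ += bytes; }

    void flush() {
        out_.write(buffer_);
        buffer_.clear();
    }

private:
    CountingStream& out_;
    std::string buffer_;
};

}  // namespace sketch
```

The caller never needs access to the encoder's internals: it flushes, then reads the count from the stream it already owns.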

> Support for obtaining the number of bytes written
> -
>
> Key: AVRO-1203
> URL: https://issues.apache.org/jira/browse/AVRO-1203
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.7.2
>Reporter: Manuel Simoni
>Priority: Major
> Fix For: 1.9.0
>
>
> I've used the Avro C++ library to implement Avro support for Node.js [1].
> In the course of this, I needed to extend the library so that it keeps
> track of the number of bytes written to a stream by an encoder.
> These changes mostly required making various headers public, and only
> small changes to implementation code.
> Here's a detailed list of changes:
> - Make the following headers public:
>   -- BinaryEncoder.hh
>   -- Symbol.hh
>   -- ValidatingCodec.hh
>   -- json/JsonDom.hh
>   -- json/JsonEncoder.hh
>   -- json/JsonIO.hh
> - Extend StreamWriter to keep track of number of bytes written 
> (bytesWritten_, getBytesWritten())
> - Expose BinaryEncoder's and JsonEncoder's StreamWriter via getStreamWriter() 
> method
> The complete changes can be viewed here on Github:
> https://github.com/manuel/avro-cpp/commit/f77c108a04fc9e39397eb2fae86b2710b64e2c8a
> The code is available in the "manuel/avro-cpp" Github Repo on the
> "upstream_submit" branch:
> https://github.com/manuel/avro-cpp/tree/upstream_submit
> We would love to have this functionality added to the Avro C++ library.
> [1] https://github.com/collectivemedia/node-avro





[jira] [Assigned] (AVRO-1350) Error in decoding enums using ResolvingDecoder

2018-12-23 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. reassigned AVRO-1350:
-

Assignee: Thiruvalluvan M. G.

> Error in decoding enums using ResolvingDecoder
> --
>
> Key: AVRO-1350
> URL: https://issues.apache.org/jira/browse/AVRO-1350
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.4
>Reporter: Bin Guo
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> We can't get a correct result when decoding enums using the resolving decoder.
> e.g.
> {code:title=schema}
> {
>   "type" : "record",
>   "name" : "TestEnum",
>   "fields" : [
>   {
>   "name" : "MyMode",
>   "type" : {
> "type" : "enum",
> "name" : "Mode",
> "symbols" : [ "MEMORY", "DISK" ]
>   }
>   }
>   ]
> }
> {code}
> We encoded "DISK"(1), then decoded with resolving decoder, got "MEMORY"(0).
> I examined the code and found that there is a *sort* after reading the names
> of the reader.
> I couldn't quite understand the author's intention, but it really cannot work
> well.
> When decoding my enum, the return value is actually the *position in the
> sorted names*, and obviously it's not correct.
> {code:title=Symbol.cc}
> Symbol Symbol::enumAdjustSymbol(const NodePtr& writer, const NodePtr& reader)
> {
>     vector<string> rs;
>     size_t rc = reader->names();
>     for (size_t i = 0; i < rc; ++i) {
>         rs.push_back(reader->nameAt(i));
>     }
>     sort(rs.begin(), rs.end()); // the strange sort
> {code}
> Here is my complete test case.
> {code:title=generated structure}
> enum Mode {
>     MEMORY,
>     DISK,
> };
> struct TestEnum {
>     Mode MyMode;
> };
> {code}
> {code:title=My test case}
> #include "ts_enum.h"
> #include "avro/Compiler.hh"
> #include "avro/ValidSchema.hh"
> using namespace std;
> using namespace avro;
> using namespace enum_test;
> static const char ts_schema_string[] =
>     "{ \"type\" : \"record\", \"name\" : \"TestEnum\", \"fields\" : "
>     "[ { \"name\" : \"MyMode\", \"type\" : "
>     "{ \"type\" : \"enum\", \"name\" : \"Mode\", "
>     "\"symbols\" : [ \"MEMORY\", \"DISK\" ] } } ]}";
> int main(int argc, char * argv[]) {
>     TestEnum te1, te2;
>     ValidSchema reader = compileJsonSchemaFromString(ts_schema_string);
>     ValidSchema writer = compileJsonSchemaFromString(ts_schema_string);
>     // encode TestEnum
>     auto_ptr<OutputStream> out_stream = memoryOutputStream();
>     EncoderPtr encoder = binaryEncoder();
>     encoder->init(*out_stream);
>     te1.MyMode = DISK;
>     encode(*encoder, te1);
>     encoder->flush();
>     // decode TestEnum
>     auto_ptr<InputStream> in_stream = memoryInputStream(*out_stream);
>     DecoderPtr decoder = resolvingDecoder(writer, reader,
>         avro::binaryDecoder());
>     decoder->init(*in_stream);
>     decode(*decoder, te2);
>     cout << "TE1: " << te1.MyMode << " | TE2: " << te2.MyMode << endl;
>     return 0;
> }
> {code}
> The result
> ---
> TE1: 1 | TE2: 0
> I debugged into the Avro code.
> In Symbol::enumAdjustSymbol, there is a vector of the reader's enum
> names, and after the sort, "MEMORY, DISK" became "DISK, MEMORY".
> Then, for each of the writer's enum names, a second vector stores that
> name's position in the sorted vector.
> As a result, in the returned symbol, MEMORY's position is 1 and DISK's
> position is 0.
> Finally, when we decode the enum, the *position* is returned to the target
> object.
> I couldn't quite understand the author's intention here, but when I commented
> out the sort, everything worked well.





[jira] [Resolved] (AVRO-2292) C++ does not build on Mac

2018-12-23 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-2292.
---
Resolution: Fixed

Merged the pull request https://github.com/apache/avro/pull/412

> C++ does not build on Mac
> -
>
> Key: AVRO-2292
> URL: https://issues.apache.org/jira/browse/AVRO-2292
> Project: Apache Avro
>  Issue Type: Task
>  Components: c++
>Reporter: Thiruvalluvan M. G.
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> The current master does not build on mac.





[jira] [Comment Edited] (AVRO-1777) Select best matching record when writing a union in python

2018-12-22 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727508#comment-16727508
 ] 

Thiruvalluvan M. G. edited comment on AVRO-1777 at 12/22/18 5:55 PM:
-

Reopening the issue because the fix breaks interop tests.

 

Here is a [pull request|https://github.com/apache/avro/pull/413] that reverts 
the change. We can merge it back once the interop problem is fixed.


was (Author: thiru_mg):
Reopening the issue because the fix breaks interop tests.

 

Here is a [pull request|https://github.com/apache/avro/pull/413] that reverts 
the change. We can merge it back once the interop problem is fixed.

[link title|http://example.com]

> Select best matching record when writing a union in python
> --
>
> Key: AVRO-1777
> URL: https://issues.apache.org/jira/browse/AVRO-1777
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: python
>Affects Versions: 1.7.7
>Reporter: Steven Aerts
>Priority: Major
> Fix For: 1.9.0
>
>
> Unlike javascript, python is not using wrapped types.
> So when writing a union it needs to find out which type it will output.
> At the moment it takes the last validating type.
> I propose to take the type with the most matching fields.
> So I propose to change in {{io.py}}:
> {code}
> # resolve union
> index_of_schema = -1
> for i, candidate_schema in enumerate(writers_schema.schemas):
>   if validate(candidate_schema, datum):
>     index_of_schema = i
> if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)
> {code}
> into
> {code}
> # resolve union
> index_of_schema = -1
> found_fields = -1
> for i, candidate_schema in enumerate(writers_schema.schemas):
>   if validate(candidate_schema, datum):
>     nr_fields = candidate_schema.type in ['record', 'error', 'request'] and len(candidate_schema.fields) or 1
>     if nr_fields > found_fields:
>       index_of_schema = i
>       found_fields = nr_fields
> if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)
> {code}
> If you want, I can create a pull request for this and apply it to both py3
> and py.





[jira] [Reopened] (AVRO-1777) Select best matching record when writing a union in python

2018-12-22 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. reopened AVRO-1777:
---
  Assignee: (was: Daniel Kulp)

Reopening the issue because the fix breaks interop tests.

 

Here is a [pull request|https://github.com/apache/avro/pull/413] that reverts 
the change. We can merge it back once the interop problem is fixed.

> Select best matching record when writing a union in python
> --
>
> Key: AVRO-1777
> URL: https://issues.apache.org/jira/browse/AVRO-1777
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: python
>Affects Versions: 1.7.7
>Reporter: Steven Aerts
>Priority: Major
> Fix For: 1.9.0
>
>
> Unlike javascript, python is not using wrapped types.
> So when writing a union it needs to find out which type it will output.
> At the moment it takes the last validating type.
> I propose to take the type with the most matching fields.
> So I propose to change in {{io.py}}:
> {code}
> # resolve union
> index_of_schema = -1
> for i, candidate_schema in enumerate(writers_schema.schemas):
>   if validate(candidate_schema, datum):
>     index_of_schema = i
> if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)
> {code}
> into
> {code}
> # resolve union
> index_of_schema = -1
> found_fields = -1
> for i, candidate_schema in enumerate(writers_schema.schemas):
>   if validate(candidate_schema, datum):
>     nr_fields = candidate_schema.type in ['record', 'error', 'request'] and len(candidate_schema.fields) or 1
>     if nr_fields > found_fields:
>       index_of_schema = i
>       found_fields = nr_fields
> if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)
> {code}
> If you want, I can create a pull request for this and apply it to both py3
> and py.





[jira] [Created] (AVRO-2292) C++ does not build on Mac

2018-12-21 Thread Thiruvalluvan M. G. (JIRA)
Thiruvalluvan M. G. created AVRO-2292:
-

 Summary: C++ does not build on Mac
 Key: AVRO-2292
 URL: https://issues.apache.org/jira/browse/AVRO-2292
 Project: Apache Avro
  Issue Type: Task
  Components: c++
Reporter: Thiruvalluvan M. G.
Assignee: Thiruvalluvan M. G.


The current master does not build on mac.





[jira] [Resolved] (AVRO-2274) Improve resolving performance when schemas don't change

2018-11-27 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-2274.
---
Resolution: Fixed

Merged the PR. Thank you [~raymie].

> Improve resolving performance when schemas don't change
> ---
>
> Key: AVRO-2274
> URL: https://issues.apache.org/jira/browse/AVRO-2274
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Raymie Stata
>Assignee: Raymie Stata
>Priority: Major
>
> Decoding optimizations based on the observation that schemas don't change 
> very much.  We add special-case paths to optimize the case where a 
> _sub_schema of the reader and the writer are the same.  The specific cases 
> are:
> * In the case of an enumeration, if the reader and writer are the same, then 
> we can simply return the tag written by the writer rather than "adjust" it as 
> if it might have been re-ordered.  In fact, we can do this (directly return 
> the tag written by the writer) as long as the reader-schema is an "extension" 
> of the writer's in that it may have added new symbols but hasn't renumbered 
> any of the writer's symbols.  Enumerations that either don't change at all or 
> are "extended" as defined here are the common ways to extend enumerations.  
> (Our tests show this optimization improves performance by about 3%.)
> * When the reader and writer subschemas are both unions, resolution is 
> expensive: we have an outer union preceded by a "writer-union action", but 
> each branch of this outer union consists of union-adjust actions, which are
> heavyweight.  We optimize this case when the reader and writer unions are
> the same: we fall back on the standard grammar used for a union, avoiding all 
> these adjustments.  Since unions are commonly used to encode "nullable" 
> fields in Avro, and nullability rarely changes as a schema evolves, this 
> optimization should help many users.  (Our tests show this optimization 
> improves performance by 25-30%, a significant win.)
> * The "custom code" generated for reading records has to read fields in a 
> loop that uses a switch statement to deal with writers that may have 
> re-ordered fields.  In most cases, however, fields have not been reordered 
> (esp. in more complex records with many record sub-schemas).  So we've added 
> a new method to ResolvingDecoder called readFieldOrderIfDiff, which is a 
> variant of the existing readFieldOrder.  If the field order has indeed 
> changed, then readFieldOrderIfDiff returns the new field order, just like 
> readFieldOrder does.  However, if the field-order hasn't changed, then 
> readFieldOrderIfDiff returns null.  We then modified the generation of 
> custom-decoders for records to add a special-case path that simply reads the 
> record's fields in order, without incurring the overhead of the loop or the 
> switch statement.  (Our tests show this optimization improves performance by 
> 8-9%, on top of the 35-40% produced by the original custom-coder 
> optimization.)
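The "extension" condition from the first bullet can be expressed as a simple prefix check. The sketch below is in C++ for consistency with the other examples in this archive, although the issue concerns the Java implementation; readerExtendsWriter is a hypothetical name and the function is only an illustration of the condition, not the merged code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True when every writer symbol keeps its ordinal in the reader's symbol
// list, i.e. the reader's list is the writer's list plus, optionally,
// new symbols appended at the end. In that case the tag written by the
// writer can be returned as-is, without any adjustment table.
bool readerExtendsWriter(const std::vector<std::string>& writerSymbols,
                         const std::vector<std::string>& readerSymbols) {
    if (readerSymbols.size() < writerSymbols.size()) {
        return false;  // the reader dropped symbols
    }
    for (std::size_t i = 0; i < writerSymbols.size(); ++i) {
        if (writerSymbols[i] != readerSymbols[i]) {
            return false;  // a writer symbol was renumbered or renamed
        }
    }
    return true;
}
```

A resolver could evaluate this once while building the resolution plan and fall back to the general adjust path whenever it returns false.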





[jira] [Commented] (AVRO-2272) SchemaParseException: Can't redefine: list in AvroIndexedRecordConverter

2018-11-25 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698169#comment-16698169
 ] 

Thiruvalluvan M. G. commented on AVRO-2272:
---

Looked at it and 
[commented|https://issues.apache.org/jira/browse/PARQUET-1441?focusedCommentId=16698168=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16698168]
 on PARQUET-1441.

> SchemaParseException: Can't redefine: list in AvroIndexedRecordConverter
> 
>
> Key: AVRO-2272
> URL: https://issues.apache.org/jira/browse/AVRO-2272
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Michael Heuer
>Priority: Major
>
> Companion issue to https://issues.apache.org/jira/browse/PARQUET-1441, and 
> https://issues.apache.org/jira/browse/SPARK-25588, since those issues in 
> downstream projects don't seem to be getting any notice.
> I've been able to create unit tests that reproduce the issue downstream in 
> Spark and Parquet; I would appreciate any help reproducing the issue in the 
> Avro codebase directly.





[jira] [Commented] (AVRO-2272) SchemaParseException: Can't redefine: list in AvroIndexedRecordConverter

2018-11-24 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698058#comment-16698058
 ] 

Thiruvalluvan M. G. commented on AVRO-2272:
---

Let me take a stab at it.

> SchemaParseException: Can't redefine: list in AvroIndexedRecordConverter
> 
>
> Key: AVRO-2272
> URL: https://issues.apache.org/jira/browse/AVRO-2272
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Michael Heuer
>Priority: Major
>
> Companion issue to https://issues.apache.org/jira/browse/PARQUET-1441, and 
> https://issues.apache.org/jira/browse/SPARK-25588, since those issues in 
> downstream projects don't seem to be getting any notice.
> I've been able to create unit tests that reproduce the issue downstream in 
> Spark and Parquet; I would appreciate any help reproducing the issue in the 
> Avro codebase directly.





[jira] [Created] (AVRO-2273) Release 1.8.3

2018-11-24 Thread Thiruvalluvan M. G. (JIRA)
Thiruvalluvan M. G. created AVRO-2273:
-

 Summary: Release 1.8.3
 Key: AVRO-2273
 URL: https://issues.apache.org/jira/browse/AVRO-2273
 Project: Apache Avro
  Issue Type: Task
Reporter: Thiruvalluvan M. G.


This ticket is for releasing Avro 1.8.3 and discussing any topics related to it.





[jira] [Updated] (AVRO-2273) Release 1.8.3

2018-11-24 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2273:
--
Fix Version/s: 1.8.3

> Release 1.8.3
> -
>
> Key: AVRO-2273
> URL: https://issues.apache.org/jira/browse/AVRO-2273
> Project: Apache Avro
>  Issue Type: Task
>Reporter: Thiruvalluvan M. G.
>Priority: Major
> Fix For: 1.8.3
>
>
> This ticket is for releasing Avro 1.8.3 and discussing any topics related to 
> it.





[jira] [Resolved] (AVRO-1173) C++ API for dynamic reading/writing based on schema

2018-11-23 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-1173.
---
Resolution: Not A Problem

> C++ API for dynamic reading/writing based on schema
> ---
>
> Key: AVRO-1173
> URL: https://issues.apache.org/jira/browse/AVRO-1173
> Project: Apache Avro
>  Issue Type: Wish
>  Components: c++
>Reporter: Stefan Langer
>Priority: Major
>
> When I started looking at Avro I hoped it would offer some API to read values 
> by name/id (or at least get name/id of datum while iterating over all 
> entries).
> When looking at examples for C: 
> http://avro.apache.org/docs/1.6.3/api/c/index.html#_examples
> ... or some Java examples
> There are getters/setters which have name-arguments, and there are 
> Record-objects constructed from schema which help reading/writing data.
> While testing the C++ API, I couldn't find a way to do so with it!
> I'm still not sure if I'm missing some part of the API or if it is just not 
> yet part of the C++ Interface.
> About C API: I could not use it, because it is C99 focused, so it can't be 
> compiled on our VS2008 ... For the C++ API it's just some tiny tweaks to get 
> it running.
> About Generator: I'm not interested in generating code (if I were, there
> are enough alternatives to Avro ...)





[jira] [Commented] (AVRO-1173) C++ API for dynamic reading/writing based on schema

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691248#comment-16691248
 ] 

Thiruvalluvan M. G. commented on AVRO-1173:
---

[~sl.sy.ifm] Really sorry to come back after six years. The functionality you
want is present; the usage is slightly different:
{code}
const GenericDatum& value = ...;
record.field("fieldname") = value;
{code}

> C++ API for dynamic reading/writing based on schema
> ---
>
> Key: AVRO-1173
> URL: https://issues.apache.org/jira/browse/AVRO-1173
> Project: Apache Avro
>  Issue Type: Wish
>  Components: c++
>Reporter: Stefan Langer
>Priority: Major
>
> When I started looking at Avro I hoped it would offer some API to read values 
> by name/id (or at least get name/id of datum while iterating over all 
> entries).
> When looking at examples for C: 
> http://avro.apache.org/docs/1.6.3/api/c/index.html#_examples
> ... or some Java examples
> There are getters/setters which have name-arguments, and there are 
> Record-objects constructed from schema which help reading/writing data.
> While testing the C++ API, I couldn't find a way to do so with it!
> I'm still not sure if I'm missing some part of the API or if it is just not 
> yet part of the C++ Interface.
> About C API: I could not use it, because it is C99 focused, so it can't be 
> compiled on our VS2008 ... For the C++ API it's just some tiny tweaks to get 
> it running.
> About Generator: I'm not interested in generating code (if I would be there 
> are enough alternatives to Avro ...)





[jira] [Resolved] (AVRO-1636) C++ JsonDecoder expects json object to be ordered

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-1636.
---
Resolution: Won't Fix

The issue is rare and not specific to the C++ implementation. As mentioned 
earlier, the code can read JSON produced by Avro. If the JSON comes from 
elsewhere, decoding might fail. Since the JSON format exists primarily for 
debugging and not for storing production data, I'm resolving this issue as 
"won't fix".

> C++ JsonDecoder expects json object to be ordered
> -
>
> Key: AVRO-1636
> URL: https://issues.apache.org/jira/browse/AVRO-1636
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.7
>Reporter: Mann Du
>Priority: Major
>
> I am using  Shafquat Rahman's original post for this problem reported in Avro 
> user mailing list in last May for the description - ( Thiru provided a fix 
> for the exact problem for Java in Oct. 2011 with Avro-895.)
> I have been experimenting with avro in C++ (version 1.7.5) and ran into an 
> issue with the json decoder which expects ordered json objects. The problem I 
> am seeing appears similar to this post I found for an older avro java library:
> http://search-hadoop.com/m/7WG37aVaBd/v=plain
> I have a simple record:
> {
> "name" : "SimpleRecord",
> "type" : "record",
> "fields" :[ 
> { "name" : "A", "type" : "int"},
> { "name" : "B", "type" : "int"}
> ]
> }
> I generate the C++ header using avrogencpp. The generated  code has 
> codec_traits specialization for SimpleRecord that fixes the order for the 
> JsonEncoder and JsonDecoder.
> ...snip...
> namespace avro {
> template<> struct codec_traits<SimpleRecord> {
> static void encode(Encoder& e, const SimpleRecord& v) {
> avro::encode(e, v.A);
> avro::encode(e, v.B);
> }
> static void decode(Decoder& d, SimpleRecord& v) {
> avro::decode(d, v.A);
> avro::decode(d, v.B);
> }
> };
> ...snip...
> The JsonDecoder successfully decodes json objects of the form {"A" : 1, "B" : 
> 2} into SimpleRecord. But if I try to decode {"B" : 2, "A" : 1} it throws 
> 'avro::Exception' with "Incorrect field" from impl/parsing/JsonCodec.cc:182 
> in the following method:
> JsonDecoderHandler(JsonParser& p) : in_(p) { }
> size_t handle(const Symbol& s) {
> switch (s.kind()) {
> case Symbol::sRecordStart:
> expectToken(in_, JsonParser::tkObjectStart);
> break;
> case Symbol::sRecordEnd:
> expectToken(in_, JsonParser::tkObjectEnd);
> break;
> case Symbol::sField:
> expectToken(in_, JsonParser::tkString);
> if (s.extra() != in_.stringValue()) {
> throw Exception("Incorrect field");
> }
> break;
> default:
> break;
> }
> return 0;
> }
> The stack shows that avro::decode(d, v.A) is the call that eventually causes 
> the exception.
> According to the json spec the fields in a json object are unordered. ...
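The order dependence described above can be illustrated outside Avro. What follows is a minimal sketch in Java, not the C++ JsonDecoder and not Avro code; the class and method names are hypothetical. decodeStrict mirrors the lock-step check in the quoted handler, where the next JSON key must equal the next schema field. decodeByName shows one possible order-insensitive alternative that looks values up by field name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FieldOrderDemo {
    // Lock-step decoding: the i-th JSON key must be the i-th schema field.
    static List<Integer> decodeStrict(List<String> schemaFields, Map<String, Integer> json) {
        List<Integer> out = new ArrayList<>();
        Iterator<Map.Entry<String, Integer>> it = json.entrySet().iterator();
        for (String field : schemaFields) {
            if (!it.hasNext()) {
                throw new IllegalStateException("Missing field: " + field);
            }
            Map.Entry<String, Integer> entry = it.next();
            if (!entry.getKey().equals(field)) {
                throw new IllegalStateException("Incorrect field");
            }
            out.add(entry.getValue());
        }
        return out;
    }

    // Order-insensitive decoding: look each schema field up by name.
    static List<Integer> decodeByName(List<String> schemaFields, Map<String, Integer> json) {
        List<Integer> out = new ArrayList<>();
        for (String field : schemaFields) {
            Integer value = json.get(field);
            if (value == null) {
                throw new IllegalStateException("Missing field: " + field);
            }
            out.add(value);
        }
        return out;
    }

    // Stand-in for the parsed object {"B" : 2, "A" : 1}, keeping document order.
    static Map<String, Integer> reversedInput() {
        Map<String, Integer> json = new LinkedHashMap<>();
        json.put("B", 2);
        json.put("A", 1);
        return json;
    }

    // True if the strict decoder accepts the input, false if it rejects it.
    static boolean strictAccepts(List<String> schemaFields, Map<String, Integer> json) {
        try {
            decodeStrict(schemaFields, json);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("A", "B");
        System.out.println(strictAccepts(schema, reversedInput())); // false
        System.out.println(decodeByName(schema, reversedInput()));  // [1, 2]
    }
}
```

The by-name variant has to buffer the whole object before emitting values in schema order, which is a cost the streaming lock-step approach avoids.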





[jira] [Updated] (AVRO-1182) DataFileReader missing seek, sync methods

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1182:
--
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

> DataFileReader missing seek, sync methods
> -
>
> Key: AVRO-1182
> URL: https://issues.apache.org/jira/browse/AVRO-1182
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.7.3
>Reporter: Daniel Russel
>Priority: Major
> Attachments: with_partial_sync_test
>
>
> The DataFileReader is missing the seek and sync methods that are found in the 
> java version making it hard to navigate a file except in a linear fashion.





[jira] [Resolved] (AVRO-1383) Unabe to build on Visual Studio 2003 when generated file is huge

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-1383.
---
Resolution: Won't Fix

Won't fix since the development platform is really old.

> Unabe to build on Visual Studio 2003 when generated file is huge
> 
>
> Key: AVRO-1383
> URL: https://issues.apache.org/jira/browse/AVRO-1383
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.4
> Environment: Windows VS 2003
>Reporter: Ramana Suvarapu
>Priority: Major
>
> Hi,
> This is related to AVRO-1370. Currently C++ code generation produces a single 
> file with a lot of inline functions. If the schema file is huge, it 
> generates a huge header file. When this header file is used to build the 
> project, we get "object file format limit exceeded : more than 65,279 
> sections". To fix this problem we had to add the /bigobj flag to the project, 
> and this fixed the problem.
> Unfortunately /bigobj is only supported from VS 2005 onwards. Versions prior 
> to VS 2005 don't have this flag. 
> Is it possible to split the generated file into multiple parts by class name 
> and its avro traits code?





[jira] [Resolved] (AVRO-1133) Build failing with Visual Studio C++ 2008 due to missing stdint.h

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-1133.
---
Resolution: Won't Fix

Won't fix since the development platform is very old.

> Build failing with Visual Studio C++ 2008 due to missing stdint.h
> -
>
> Key: AVRO-1133
> URL: https://issues.apache.org/jira/browse/AVRO-1133
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.1
> Environment: Windows XP Professional 32-bit SP3, Microsoft Visual 
> Studio 2008 SP1
>Reporter: Laurent Moss
>Priority: Major
>
> Several Avro C++ API files refer to stdint.h. However, this file is not 
> available on Microsoft Visual Studio 2008 (and previous versions). This 
> results in several build errors such as:
> C:\workspace\avro-cpp\api\Validator.hh(24) : fatal error C1083: Cannot open 
> include file: 'stdint.h': No such file or directory
> This is similar to an issue previously faced by the Avro C API:
> https://issues.apache.org/jira/browse/AVRO-551
> This issue was fixed in the Avro C API by integrating open-source ISO C9x 
> compliant stdint.h and inttypes.h files for Microsoft Visual Studio:
> https://code.google.com/p/msinttypes/
> An alternative for the Avro C++ API would be to replace references to 
> stdint.h by references to Boost's cstdint.hpp
> http://www.boost.org/doc/libs/1_50_0/boost/cstdint.hpp





[jira] [Commented] (AVRO-2244) Problems with TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691194#comment-16691194
 ] 

Thiruvalluvan M. G. commented on AVRO-2244:
---

The fix for the problem is 
[here|https://github.com/thiru-apache/avro/commit/da91a1908a5a2febd644c981d8cf0b6823199f2a]

> Problems with 
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148
> ---
>
> Key: AVRO-2244
> URL: https://issues.apache.org/jira/browse/AVRO-2244
> Project: Apache Avro
>  Issue Type: Bug
>  Components: logical types
>Reporter: Raymie Stata
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> I've seen an intermittent test failure that looks like this:
> {{Failed tests:}}
> {{  
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148}}
> {{Expected: is "20:35:18.720"}}
> {{ but: was "20:35:18.72"}}
> When I see this failure, it's always the case that the trailing digit is 
> zero.  I suspect that it's a bug where the trailing zero is not printed.  
> Since the test cases use the current time, then most of the time the trailing 
> digit isn't zero and the bug isn't tickled.  But once-in-a-while the current 
> time has a trailing zero, which tickles the bug.
> If this diagnosis is correct, then in addition to fixing the bug, it might be 
> a good idea to add tests with hard-wired, static times that cover corner 
> cases like this one.
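The suggestion to test with hard-wired times can be sketched in plain Java. This is only an illustration under an assumption: that the mismatch comes from comparing a fixed-width millisecond format with a variable-width one. The report does not confirm that this is the cause in Avro, and the class and method names below are hypothetical. The sketch shows that a static time whose millisecond value ends in zero exposes the difference deterministically.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class TrailingZeroDemo {
    // Fixed-width pattern: always prints exactly three fractional digits.
    static final DateTimeFormatter FIXED_MILLIS =
        DateTimeFormatter.ofPattern("HH:mm:ss.SSS");

    static String fixedWidth(LocalTime time) {
        return FIXED_MILLIS.format(time);
    }

    // ISO_LOCAL_TIME prints only as many fractional digits as are needed.
    static String variableWidth(LocalTime time) {
        return DateTimeFormatter.ISO_LOCAL_TIME.format(time);
    }

    public static void main(String[] args) {
        // Hard-wired corner case: 720 ms, i.e. a trailing zero in the millis.
        LocalTime time = LocalTime.of(20, 35, 18, 720_000_000);
        System.out.println(fixedWidth(time));    // 20:35:18.720
        System.out.println(variableWidth(time)); // 20:35:18.72
    }
}
```

A test built on a fixed value like this fails every time the two formats disagree, instead of only when the current time happens to end in a zero.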





[jira] [Comment Edited] (AVRO-2244) Problems with TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691177#comment-16691177
 ] 

Thiruvalluvan M. G. edited comment on AVRO-2244 at 11/19/18 2:46 AM:
-

The solution for AVRO-2241 reduces the problem frequency from 1 in 10 to 1 in 
1000.
I showed how to deterministically reproduce the problem 
[here|https://github.com/thiru-apache/avro/commit/7c4160553f4337c0ded0b40ac2833cebf24e2a07]


was (Author: thiru_mg):
I showed how to deterministically reproduce the problem 
[here|https://github.com/thiru-apache/avro/commit/7c4160553f4337c0ded0b40ac2833cebf24e2a07]

> Problems with 
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148
> ---
>
> Key: AVRO-2244
> URL: https://issues.apache.org/jira/browse/AVRO-2244
> Project: Apache Avro
>  Issue Type: Bug
>  Components: logical types
>Reporter: Raymie Stata
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> I've seen an intermittent test failure that looks like this:
> {{Failed tests:}}
> {{  
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148}}
> {{Expected: is "20:35:18.720"}}
> {{ but: was "20:35:18.72"}}
> When I see this failure, it's always the case that the trailing digit is 
> zero.  I suspect that it's a bug where the trailing zero is not printed.  
> Since the test cases use the current time, then most of the time the trailing 
> digit isn't zero and the bug isn't tickled.  But once-in-a-while the current 
> time has a trailing zero, which tickles the bug.
> If this diagnosis is correct, then in addition to fixing the bug, it might be 
> a good idea to add tests with hard-wired, static times that cover corner 
> cases like this one.





[jira] [Issue Comment Deleted] (AVRO-2244) Problems with TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2244:
--
Comment: was deleted

(was: The fix for the problem is 
[here|https://github.com/thiru-apache/avro/commit/da91a1908a5a2febd644c981d8cf0b6823199f2a])

> Problems with 
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148
> ---
>
> Key: AVRO-2244
> URL: https://issues.apache.org/jira/browse/AVRO-2244
> Project: Apache Avro
>  Issue Type: Bug
>  Components: logical types
>Reporter: Raymie Stata
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> I've seen an intermittent test failure that looks like this:
> {{Failed tests:}}
> {{  
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148}}
> {{Expected: is "20:35:18.720"}}
> {{ but: was "20:35:18.72"}}
> When I see this failure, it's always the case that the trailing digit is 
> zero.  I suspect that it's a bug where the trailing zero is not printed.  
> Since the test cases use the current time, then most of the time the trailing 
> digit isn't zero and the bug isn't tickled.  But once-in-a-while the current 
> time has a trailing zero, which tickles the bug.
> If this diagnosis is correct, then in addition to fixing the bug, it might be 
> a good idea to add tests with hard-wired, static times that cover corner 
> cases like this one.





[jira] [Comment Edited] (AVRO-2244) Problems with TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691177#comment-16691177
 ] 

Thiruvalluvan M. G. edited comment on AVRO-2244 at 11/19/18 2:23 AM:
-

I showed how to deterministically reproduce the problem 
[here|https://github.com/thiru-apache/avro/commit/7c4160553f4337c0ded0b40ac2833cebf24e2a07]


was (Author: thiru_mg):
I showed how to deterministically reproduce the problem 
[here|https://github.com/thiru-apache/avro/commit/97fb7c28a886384482ecafe7dad0281c10357bc4]

> Problems with 
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148
> ---
>
> Key: AVRO-2244
> URL: https://issues.apache.org/jira/browse/AVRO-2244
> Project: Apache Avro
>  Issue Type: Bug
>  Components: logical types
>Reporter: Raymie Stata
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> I've seen an intermittent test failure that looks like this:
> {{Failed tests:}}
> {{  
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148}}
> {{Expected: is "20:35:18.720"}}
> {{ but: was "20:35:18.72"}}
> When I see this failure, it's always the case that the trailing digit is 
> zero.  I suspect that it's a bug where the trailing zero is not printed.  
> Since the test cases use the current time, then most of the time the trailing 
> digit isn't zero and the bug isn't tickled.  But once-in-a-while the current 
> time has a trailing zero, which tickles the bug.
> If this diagnosis is correct, then in addition to fixing the bug, it might be 
> a good idea to add tests with hard-wired, static times that cover corner 
> cases like this one.





[jira] [Commented] (AVRO-2244) Problems with TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148

2018-11-18 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691177#comment-16691177
 ] 

Thiruvalluvan M. G. commented on AVRO-2244:
---

I showed how to deterministically reproduce the problem 
[here|https://github.com/thiru-apache/avro/commit/97fb7c28a886384482ecafe7dad0281c10357bc4]

> Problems with 
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148
> ---
>
> Key: AVRO-2244
> URL: https://issues.apache.org/jira/browse/AVRO-2244
> Project: Apache Avro
>  Issue Type: Bug
>  Components: logical types
>Reporter: Raymie Stata
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> I've seen an intermittent test failure that looks like this:
> {{Failed tests:}}
> {{  
> TestSpecificLogicalTypes.testAbilityToReadJsr310RecordWrittenAsJodaRecord:148}}
> {{Expected: is "20:35:18.720"}}
> {{ but: was "20:35:18.72"}}
> When I see this failure, it's always the case that the trailing digit is 
> zero.  I suspect that it's a bug where the trailing zero is not printed.  
> Since the test cases use the current time, then most of the time the trailing 
> digit isn't zero and the bug isn't tickled.  But once-in-a-while the current 
> time has a trailing zero, which tickles the bug.
> If this diagnosis is correct, then in addition to fixing the bug, it might be 
> a good idea to add tests with hard-wired, static times that cover corner 
> cases like this one.





[jira] [Updated] (AVRO-2268) Perf.java SpecificRecord input data not working

2018-11-17 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2268:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged the PR. Thank you Raymie.

> Perf.java SpecificRecord input data not working
> ---
>
> Key: AVRO-2268
> URL: https://issues.apache.org/jira/browse/AVRO-2268
> Project: Apache Avro
>  Issue Type: Test
>  Components: java
>Reporter: Raymie Stata
>Assignee: Raymie Stata
>Priority: Major
>
> In {{FooBarSpecificRecordTest.genSingleRecord}}, the {{nicknames}} field is 
> given an instance of what is returned by {{Arrays.asList}}, which does 
> _not_ support the {{clear}} method.  When reusing objects during a read, the 
> {{clear}} method is used to clear the contents of array-valued fields, 
> which causes an {{UnsupportedOperationException}}.  So 
> {{genSingleRecord}} needs to change to set {{nicknames}} to a type that 
> implements {{clear}}.
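The behaviour described above can be reproduced with the JDK alone. The sketch below is not the Perf.java code; the class and method names are hypothetical. It shows that the fixed-size list returned by Arrays.asList rejects clear() when non-empty, while a copy held in an ArrayList accepts it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FixedSizeListDemo {
    // Returns true if clear() succeeds on the list, false if it is unsupported.
    static boolean canClear(List<String> list) {
        try {
            list.clear();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Fixed-size view backed by an array: structural changes are rejected.
        List<String> fixed = Arrays.asList("a", "b");
        // Growable copy: supports clear(), so it can be reused during a read.
        List<String> growable = new ArrayList<>(Arrays.asList("a", "b"));

        System.out.println(canClear(fixed));    // false
        System.out.println(canClear(growable)); // true
    }
}
```

Wrapping the values in new ArrayList<>(...) is one way to give a field a list type that implements clear.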





[jira] [Updated] (AVRO-2268) Perf.java SpecificRecord input data not working

2018-11-17 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2268:
--
Status: Patch Available  (was: Open)

> Perf.java SpecificRecord input data not working
> ---
>
> Key: AVRO-2268
> URL: https://issues.apache.org/jira/browse/AVRO-2268
> Project: Apache Avro
>  Issue Type: Test
>  Components: java
>Reporter: Raymie Stata
>Assignee: Raymie Stata
>Priority: Major
>
> In {{FooBarSpecificRecordTest.genSingleRecord}}, the {{nicknames}} field is 
> given an instance of what is returned by {{Arrays.asList}}, which does 
> _not_ support the {{clear}} method.  When reusing objects during a read, the 
> {{clear}} method is used to clear the contents of array-valued fields, 
> which causes an {{UnsupportedOperationException}}.  So 
> {{genSingleRecord}} needs to change to set {{nicknames}} to a type that 
> implements {{clear}}.





[jira] [Updated] (AVRO-2266) Avoid m2e plugin warning

2018-11-15 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2266:
--
Status: Patch Available  (was: Open)

> Avoid m2e plugin warning
> 
>
> Key: AVRO-2266
> URL: https://issues.apache.org/jira/browse/AVRO-2266
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Thiruvalluvan M. G.
>Assignee: Thiruvalluvan M. G.
>Priority: Minor
> Fix For: 1.9.0, 1.8.3
>
>
> When you build Java binding with maven you get
> {code:java}
> [WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
> no dependency information available
> [WARNING] Failed to retrieve plugin descriptor for 
> org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
> org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not 
> be resolved: Failure to find org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced{code}
> Even though the warning is not harmful, one has to constantly remind oneself 
> that it is indeed harmless. It will be nice if we can get rid of the warning.





[jira] [Created] (AVRO-2267) Duplicate code RandomData.java and its dependency problem

2018-11-15 Thread Thiruvalluvan M. G. (JIRA)
Thiruvalluvan M. G. created AVRO-2267:
-

 Summary: Duplicate code RandomData.java and its dependency problem
 Key: AVRO-2267
 URL: https://issues.apache.org/jira/browse/AVRO-2267
 Project: Apache Avro
  Issue Type: Improvement
  Components: java
Reporter: Thiruvalluvan M. G.
Assignee: Thiruvalluvan M. G.


There are two issues with the {{RandomData}} class:
 * There are almost identical copies of the same code in two modules: {{avro}} 
and {{avro-ipc}}. We should use a single source file.
 * Both copies belong to the {{test}} subfolders of their respective modules, 
but the {{avro-tools}} module uses this class in {{main}}.





[jira] [Created] (AVRO-2266) Avoid m2e plugin warning

2018-11-15 Thread Thiruvalluvan M. G. (JIRA)
Thiruvalluvan M. G. created AVRO-2266:
-

 Summary: Avoid m2e plugin warning
 Key: AVRO-2266
 URL: https://issues.apache.org/jira/browse/AVRO-2266
 Project: Apache Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.2
Reporter: Thiruvalluvan M. G.
Assignee: Thiruvalluvan M. G.
 Fix For: 1.9.0, 1.8.3


When you build Java binding with maven you get
{code:java}
[WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
no dependency information available

[WARNING] Failed to retrieve plugin descriptor for 
org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not be 
resolved: Failure to find org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 in 
https://repo.maven.apache.org/maven2 was cached in the local repository, 
resolution will not be reattempted until the update interval of central has 
elapsed or updates are forced{code}
Even though the warning is not harmful, one has to constantly remind oneself 
that it is indeed harmless. It will be nice if we can get rid of the warning.





[jira] [Updated] (AVRO-1702) Add LogicalType support to c++ library

2018-11-14 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1702:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

PR merged. Thanks a ton [~aniket486]

> Add LogicalType support to c++ library
> --
>
> Key: AVRO-1702
> URL: https://issues.apache.org/jira/browse/AVRO-1702
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: c++
>Reporter: peter liu
>Assignee: Aniket Mokashi
>Priority: Major
>
> I'd like to port the logicaltype support to c++ library





[jira] [Assigned] (AVRO-1702) Add LogicalType support to c++ library

2018-11-13 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. reassigned AVRO-1702:
-

Assignee: Aniket Mokashi  (was: peter liu)

> Add LogicalType support to c++ library
> --
>
> Key: AVRO-1702
> URL: https://issues.apache.org/jira/browse/AVRO-1702
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: c++
>Reporter: peter liu
>Assignee: Aniket Mokashi
>Priority: Major
>
> I'd like to port the logicaltype support to c++ library





[jira] [Commented] (AVRO-2178) avro C++ api support of tail reading of a growing avro file

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684629#comment-16684629
 ] 

Thiruvalluvan M. G. commented on AVRO-2178:
---

The problem is not due to our C++ implementation; it will happen with any 
implementation. I will close the issue as 'not a problem' unless someone 
objects.

> avro C++ api support of tail reading of a growing avro file
> ---
>
> Key: AVRO-2178
> URL: https://issues.apache.org/jira/browse/AVRO-2178
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.8.2
>Reporter: peien
>Priority: Major
>
> Two processes, one is writing to an avro data file, another wishes to read 
> the latest written data.
> The problem with current C++ API is that when it reaches the EOF, an 
> exception will be thrown, and from the user perspective, I have no way to 
> retry or 'tail read' it again from the last good position.





[jira] [Commented] (AVRO-1383) Unabe to build on Visual Studio 2003 when generated file is huge

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684626#comment-16684626
 ] 

Thiruvalluvan M. G. commented on AVRO-1383:
---

VS 2003 is fifteen years old. Do we need to address this issue? Is there any 
objection if we close this issue as 'WONTFIX'?

> Unabe to build on Visual Studio 2003 when generated file is huge
> 
>
> Key: AVRO-1383
> URL: https://issues.apache.org/jira/browse/AVRO-1383
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.4
> Environment: Windows VS 2003
>Reporter: Ramana Suvarapu
>Priority: Major
>
> Hi,
> This is related to AVRO-1370. Currently C++ code generation produces a single 
> file with a lot of inline functions. If the schema file is huge, it 
> generates a huge header file. When this header file is used to build the 
> project, we get "object file format limit exceeded : more than 65,279 
> sections". To fix this problem we had to add the /bigobj flag to the project, 
> and this fixed the problem.
> Unfortunately /bigobj is only supported from VS 2005 onwards. Versions prior 
> to VS 2005 don't have this flag. 
> Is it possible to split the generated file into multiple parts by class name 
> and its avro traits code?





[jira] [Commented] (AVRO-1133) Build failing with Visual Studio C++ 2008 due to missing stdint.h

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684627#comment-16684627
 ] 

Thiruvalluvan M. G. commented on AVRO-1133:
---

VS 2008 is ten years old. Do we need to address this issue? Is there any 
objection if we close this issue as 'WONTFIX'?

> Build failing with Visual Studio C++ 2008 due to missing stdint.h
> -
>
> Key: AVRO-1133
> URL: https://issues.apache.org/jira/browse/AVRO-1133
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.1
> Environment: Windows XP Professional 32-bit SP3, Microsoft Visual 
> Studio 2008 SP1
>Reporter: Laurent Moss
>Priority: Major
>
> Several Avro C++ API files refer to stdint.h. However, this file is not 
> available on Microsoft Visual Studio 2008 (and previous versions). This 
> results in several build errors such as:
> C:\workspace\avro-cpp\api\Validator.hh(24) : fatal error C1083: Cannot open 
> include file: 'stdint.h': No such file or directory
> This is similar to an issue previously faced by the Avro C API:
> https://issues.apache.org/jira/browse/AVRO-551
> This issue was fixed in the Avro C API by integrating open-source ISO C9x 
> compliant stdint.h and inttypes.h files for Microsoft Visual Studio:
> https://code.google.com/p/msinttypes/
> An alternative for the Avro C++ API would be to replace references to 
> stdint.h by references to Boost's cstdint.hpp
> http://www.boost.org/doc/libs/1_50_0/boost/cstdint.hpp





[jira] [Updated] (AVRO-2010) multi dimensional array schema is not generated as expected

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2010:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pull request merged

> multi dimensional array schema is not generated as expected
> ---
>
> Key: AVRO-2010
> URL: https://issues.apache.org/jira/browse/AVRO-2010
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
> Environment: Ubuntu 14.04, boost 1.62
>Reporter: Philip Henzler
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> I made the following change to the unittest.cc, adding a two dimensional 
> array to the buildSchema() test:
> {code}
> ArraySchema array = ArraySchema(DoubleSchema());
> const std::string s("myarray");
> record.addField(s, array);
> ArraySchema array2 = ArraySchema(ArraySchema(DoubleSchema()));
> const std::string s2("my2dimarray");
> record.addField(s2, array2);
> {code}
> It creates the following JSON schema output:
> {code}
> {
> "name": "myarray",
> "type": {
> "type": "array",
> "items": "double"
> }
> },
> {
> "name": "my2dimarray",
> "type": {
> "type": "array",
> "items": "double"
> }
> }
> {code}
> However, I would expect the following output for the two-dimensional case:
> {code}
> {
> "name": "my2dimarray",
> "type": {
> "type": "array",
> "items": {
> "type": "array",
> "items": "double"
> }
> }
> }
> {code}
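The expected nesting can be sketched independently of Avro. The example below is in Java and uses hypothetical types (Node, primitive, arrayOf), not the Avro C++ ArraySchema API. The report does not describe the cause of the bug, so this only shows the intended shape: the items of the outer array are rendered as the complete inner array schema rather than collapsed to the leaf type.

```java
public class NestedArraySchemaDemo {
    // Minimal schema node: anything that can render itself as JSON.
    interface Node {
        String toJson();
    }

    // A primitive type is rendered as its quoted name, e.g. "double".
    static Node primitive(String name) {
        return () -> "\"" + name + "\"";
    }

    // An array is rendered recursively, embedding the full schema of its items.
    static Node arrayOf(Node items) {
        return () -> "{\"type\":\"array\",\"items\":" + items.toJson() + "}";
    }

    public static void main(String[] args) {
        Node oneDim = arrayOf(primitive("double"));
        Node twoDim = arrayOf(arrayOf(primitive("double")));
        System.out.println(oneDim.toJson());
        System.out.println(twoDim.toJson());
    }
}
```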





[jira] [Updated] (AVRO-1132) Build failing on MSYS/MinGW due to missing struct iovec

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1132:
--
   Resolution: Fixed
Fix Version/s: 1.8.4
   Status: Resolved  (was: Patch Available)

> Build failing on MSYS/MinGW due to missing struct iovec
> ---
>
> Key: AVRO-1132
> URL: https://issues.apache.org/jira/browse/AVRO-1132
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.1
> Environment: Windows XP Professional 32-bit SP3, MSYS, MinGW GCC 4.5.1
>Reporter: Laurent Moss
>Assignee: Laurent Moss
>Priority: Major
>  Labels: build
> Fix For: 1.8.4
>
> Attachments: AVRO-1132.diff
>
>
> Avro C++ fails to build on MSYS with MinGW GCC due to references to 
> undeclared struct iovec:
> In file included from C:/workspace/avro-cpp/api/buffer/BufferReader.hh:22:0,
>  from C:/workspace/avro-cpp/api/Reader.hh:30,
>  from C:/workspace/avro-cpp/api/ResolverSchema.hh:28,
>  from c:/workspace/avro-cpp/impl/ResolverSchema.cc:20:
> C:/workspace/avro-cpp/api/buffer/Buffer.hh: In function 'void 
> avro::toIovec(BufferType&, std::vector&)':
> C:/workspace/avro-cpp/api/buffer/Buffer.hh:517:15: error: invalid use of 
> incomplete type 'struct avro::iovec'
> C:/workspace/avro-cpp/api/buffer/Buffer.hh:511:57: error: forward declaration 
> of 'struct avro::iovec'
> C:/workspace/avro-cpp/api/buffer/Buffer.hh:518:15: error: invalid use of 
> incomplete type 'struct avro::iovec'
> C:/workspace/avro-cpp/api/buffer/Buffer.hh:511:57: error: forward declaration 
> of 'struct avro::iovec'
> make[2]: *** [CMakeFiles/avrocpp_s.dir/impl/ResolverSchema.cc.obj] Error 1
> make[1]: *** [CMakeFiles/avrocpp_s.dir/all] Error 2
> make: *** [all] Error 2





[jira] [Commented] (AVRO-1520) seek error when using compression and schema evolution

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684163#comment-16684163
 ] 

Thiruvalluvan M. G. commented on AVRO-1520:
---

The patch might cause a performance problem when skipping over a large 
interval, because it keeps reading from the underlying stream. A test case 
demonstrating the problem would greatly help in understanding it.
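
To illustrate the cost difference in general terms, here is a minimal sketch using only standard C++ streams. Both helpers are hypothetical and are not taken from AVRO-1520.patch: one skips by reading and discarding bytes, so its work grows with the distance skipped, while the other repositions a seekable stream in one call.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: skip `n` bytes by reading and discarding them.
// Returns the number of bytes actually skipped (fewer if the stream ends).
inline std::size_t skipByReading(std::istream& in, std::size_t n) {
    char scratch[4096];
    std::size_t skipped = 0;
    while (skipped < n && in) {
        std::size_t want = std::min(n - skipped, sizeof(scratch));
        in.read(scratch, static_cast<std::streamsize>(want));
        skipped += static_cast<std::size_t>(in.gcount());
    }
    return skipped;
}

// For comparison: on a seekable stream the same skip is a single reposition.
// Returns false if the stream cannot seek to the requested offset.
inline bool skipBySeeking(std::istream& in, std::size_t n) {
    in.seekg(static_cast<std::streamoff>(n), std::ios_base::cur);
    return !in.fail();
}
```

A stream that cannot seek has only the first option, which is where the concern about large skip intervals comes from.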

> seek error when using compression and schema evolution
> --
>
> Key: AVRO-1520
> URL: https://issues.apache.org/jira/browse/AVRO-1520
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.6, 1.7.7
> Environment: All
>Reporter: Steve Roehrs
>Priority: Major
>  Labels: patch
> Fix For: 1.7.6
>
> Attachments: AVRO-1520.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When using compression and schema evolution, FileStream sometimes attempts to 
> seek on an IStreamBuffer, which fails. This makes compression unusable for our 
> application. A fix has been developed; a patch will be included shortly.





[jira] [Commented] (AVRO-1182) DataFileReader missing seek, sync methods

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684137#comment-16684137
 ] 

Thiruvalluvan M. G. commented on AVRO-1182:
---

AVRO-2214 has addressed this problem. Can we mark this closed?

> DataFileReader missing seek, sync methods
> -
>
> Key: AVRO-1182
> URL: https://issues.apache.org/jira/browse/AVRO-1182
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c++
>Affects Versions: 1.7.3
>Reporter: Daniel Russel
>Priority: Major
> Attachments: with_partial_sync_test
>
>
> The DataFileReader is missing the seek and sync methods that are found in the 
> Java version, making it hard to navigate a file except in a linear fashion.





[jira] [Updated] (AVRO-2010) multi dimensional array schema is not generated as expected

2018-11-12 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2010:
--
Assignee: Thiruvalluvan M. G.
  Status: Patch Available  (was: Open)

> multi dimensional array schema is not generated as expected
> ---
>
> Key: AVRO-2010
> URL: https://issues.apache.org/jira/browse/AVRO-2010
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
> Environment: Ubuntu 14.04, boost 1.62
>Reporter: Philip Henzler
>Assignee: Thiruvalluvan M. G.
>Priority: Major
>
> I made the following change to unittest.cc, adding a two-dimensional 
> array to the buildSchema() test:
> {code}
> ArraySchema array = ArraySchema(DoubleSchema());
> const std::string s("myarray");
> record.addField(s, array);
> ArraySchema array2 = ArraySchema(ArraySchema(DoubleSchema()));
> const std::string s2("my2dimarray");
> record.addField(s2, array2);
> {code}
> It creates the following JSON schema output:
> {code}
> {
>   "name": "myarray",
>   "type": {
>     "type": "array",
>     "items": "double"
>   }
> },
> {
>   "name": "my2dimarray",
>   "type": {
>     "type": "array",
>     "items": "double"
>   }
> }
> {code}
> However, I would expect the following output for the two-dimensional case:
> {code}
> {
>   "name": "my2dimarray",
>   "type": {
>     "type": "array",
>     "items": {
>       "type": "array",
>       "items": "double"
>     }
>   }
> }
> {code}
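
One plausible explanation, which this thread does not confirm, is C++ overload resolution: when a class has a constructor taking a reference to its base type, passing a temporary of the class itself selects the implicit copy/move constructor instead, so the nesting is lost. The sketch below reproduces that effect with hypothetical stand-in classes; they only borrow the names from the report and are not Avro's implementation.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for a schema hierarchy; these are NOT Avro's classes.
struct Schema {
    virtual ~Schema() = default;
    virtual std::string toJson() const = 0;
    virtual std::shared_ptr<Schema> clone() const = 0;
};

struct DoubleSchema : Schema {
    std::string toJson() const override { return "\"double\""; }
    std::shared_ptr<Schema> clone() const override {
        return std::make_shared<DoubleSchema>(*this);
    }
};

struct ArraySchema : Schema {
    std::shared_ptr<Schema> items;

    // Wraps any schema as the item type of a new array. The compiler also
    // generates copy and move constructors taking an ArraySchema, and those
    // are a better match when the argument is itself an ArraySchema.
    explicit ArraySchema(const Schema& itemsSchema) : items(itemsSchema.clone()) {}

    std::string toJson() const override {
        return "{\"type\":\"array\",\"items\":" + items->toJson() + "}";
    }
    std::shared_ptr<Schema> clone() const override {
        return std::make_shared<ArraySchema>(*this);
    }
};

// Same shape as the code in the report: the outer call copies the inner
// array instead of wrapping it, so only one array level results.
inline std::string viaNestedTemporaries() {
    ArraySchema a = ArraySchema(ArraySchema(DoubleSchema()));
    return a.toJson();
}

// Passing the inner array through a base-class reference forces the
// wrapping constructor and yields two array levels.
inline std::string viaExplicitBase() {
    ArraySchema inner{DoubleSchema()};
    const Schema& asBase = inner;
    ArraySchema outer(asBase);
    return outer.toJson();
}
```

In this sketch, `viaNestedTemporaries()` produces a one-level array and `viaExplicitBase()` a two-level one. An overload that takes the derived type explicitly and wraps it would be one way to make the first form behave as the reporter expected.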




