[jira] [Created] (AVRO-1614) Always getting a value...

2014-11-27 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1614:
--

 Summary: Always getting a value...
 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes


Sometimes an Avro structure becomes deeply nested.
If you want to set a specific value deep in such a tree, you would like to 
write this:

public void setSomething(String value) {
    myStruct
        .getFoo()
        .getBar()
        .getOne()
        .getOther()
        .setSomething(value);
}

The problem I ran into is that any of the four get methods can return null, 
so the code I actually have to write becomes very long.
At every step of the chain I have to add a null check and create the 
underlying instance if it is missing.
I have already started writing helper methods to do this for parts of my tree.
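
To illustrate the boilerplate described above, here is a minimal sketch. The classes MyStruct, Foo, Bar, One and Other are hypothetical hand-written stand-ins for Avro-generated record classes (only the getter/setter shape is assumed), and NullCheckSketch is just a wrapper name for the example.

```java
// Hypothetical stand-ins for generated record classes; not real Avro output.
class NullCheckSketch {
    static class Other {
        private String something;
        String getSomething() { return something; }
        void setSomething(String value) { something = value; }
    }
    static class One {
        private Other other;
        Other getOther() { return other; }
        void setOther(Other other) { this.other = other; }
    }
    static class Bar {
        private One one;
        One getOne() { return one; }
        void setOne(One one) { this.one = one; }
    }
    static class Foo {
        private Bar bar;
        Bar getBar() { return bar; }
        void setBar(Bar bar) { this.bar = bar; }
    }
    static class MyStruct {
        private Foo foo;
        Foo getFoo() { return foo; }
        void setFoo(Foo foo) { this.foo = foo; }
    }

    // Every level needs its own null check before the next getter is safe.
    static void setSomething(MyStruct myStruct, String value) {
        Foo foo = myStruct.getFoo();
        if (foo == null) {
            foo = new Foo();
            myStruct.setFoo(foo);
        }
        Bar bar = foo.getBar();
        if (bar == null) {
            bar = new Bar();
            foo.setBar(bar);
        }
        One one = bar.getOne();
        if (one == null) {
            one = new One();
            bar.setOne(one);
        }
        Other other = one.getOther();
        if (other == null) {
            other = new Other();
            one.setOther(other);
        }
        other.setSomething(value);
    }

    public static void main(String[] args) {
        MyStruct myStruct = new MyStruct();
        setSomething(myStruct, "hello");
        System.out.println(
            myStruct.getFoo().getBar().getOne().getOther().getSomething());
    }
}
```

Four levels of nesting already cost sixteen lines of checks for a single assignment, which is the part the proposal tries to remove.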

To keep this code readable I came up with the following, which I would like 
to propose here before I start working on a patch.

My idea is to generate an extra 'get' method on the record class, next to 
the existing getter.

So in addition to

public Foo getFoo() {
    return foo;
}

I propose to also generate something like this for fields whose type is a 
structure that you may want to traverse as shown in the example:

public Foo getAlwaysFoo() {
    if (foo == null) {
        setFoo(Foo.newBuilder().build());
    }
    return foo;
}

This way the automatically created instance immediately has all the defaults 
I have defined.

With this naming, my code stays readable because it looks like this:

public void setSomething(String value) {
    myStruct
        .getAlwaysFoo()
        .getAlwaysBar()
        .getAlwaysOne()
        .getAlwaysOther()
        .setSomething(value);
}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AVRO-1614) Always getting a value...

2014-11-27 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Description: 
Sometimes an Avro structure becomes deeply nested.
If you want to set a specific value deep in such a tree, you would like to 
write this:

{code}
public void setSomething(String value) {
    myStruct
        .getFoo()
        .getBar()
        .getOne()
        .getOther()
        .setSomething(value);
}
{code}
The problem I ran into is that any of the four get methods can return null, 
so the code I actually have to write becomes very long.
At every step of the chain I have to add a null check and create the 
underlying instance if it is missing.
I have already started writing helper methods to do this for parts of my tree.

To keep this code readable I came up with the following, which I would like 
to propose here before I start working on a patch.

My idea is to generate an extra 'get' method on the record class, next to 
the existing getter.

So in addition to
{code}
public Foo getFoo() {
    return foo;
}
{code}
I propose to also generate something like this for fields whose type is a 
structure that you may want to traverse as shown in the example:
{code}
public Foo getAlwaysFoo() {
    if (foo == null) {
        setFoo(Foo.newBuilder().build());
    }
    return foo;
}
{code}
This way the automatically created instance immediately has all the defaults 
I have defined.

With this naming, my code stays readable because it looks like this:

{code}
public void setSomething(String value) {
    myStruct
        .getAlwaysFoo()
        .getAlwaysBar()
        .getAlwaysOne()
        .getAlwaysOther()
        .setSomething(value);
}
{code}



[jira] [Updated] (AVRO-1614) Always getting a value...

2014-11-27 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-20141027-v1.patch

First draft patch that does this.



[jira] [Commented] (AVRO-1614) Always getting a value...

2014-11-27 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227822#comment-14227822
 ] 

Niels Basjes commented on AVRO-1614:


The idea works, but I also found that for my use case it is not very pleasant 
to work with.

Assume this example again:

{code}
public void setSomething(String value) {
    myStruct
        .getAlwaysFoo()
        .getAlwaysBar()
        .getAlwaysOne()
        .getAlwaysOther()
        .setSomething(value);
}
{code}

The main problem is that in order to call .getAlwaysOne() I MUST define ALL 
fields of that type with a default value.
What I don't like about that is that I want the schema definition to enforce 
that some fields are mandatory.
By adding a default value to 'everything' I lose that capability of Avro, 
which I don't want.

At this point the only workaround I see for this (for me major) issue is to 
introduce something like a 'tree of incomplete Builders', where calling 
build() on the top one builds the entire tree.
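
A minimal sketch of why the getAlways approach forces defaults everywhere. One and Bar are hypothetical hand-written classes, not generated code, and the field name id is made up; the builder check mimics, under that assumption, a builder that refuses to build while a field without a default is still unset.

```java
// Hypothetical sketch; the IllegalStateException stands in for whatever
// error a real builder raises when a mandatory field is missing.
class DefaultsSketch {
    static class One {
        final String id;
        private One(String id) { this.id = id; }
        static Builder newBuilder() { return new Builder(); }

        static class Builder {
            private String id; // mandatory: the (assumed) schema gives no default
            Builder setId(String id) { this.id = id; return this; }
            One build() {
                if (id == null) {
                    throw new IllegalStateException(
                        "Field id not set and has no default value");
                }
                return new One(id);
            }
        }
    }

    static class Bar {
        private One one;
        One getOne() { return one; }
        void setOne(One one) { this.one = one; }

        // Same shape as the proposed getAlwaysFoo(): build an empty instance on demand.
        One getAlwaysOne() {
            if (one == null) {
                setOne(One.newBuilder().build());
            }
            return one;
        }
    }

    // Returns true when getAlwaysOne() fails because 'id' has no default.
    static boolean getAlwaysFails() {
        try {
            new Bar().getAlwaysOne();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("getAlwaysOne() fails without defaults: " + getAlwaysFails());
    }
}
```

The empty builder is built before the caller has any chance to set the mandatory field, so the only way to make getAlwaysOne() succeed is to give every field a default.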




[jira] [Commented] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229895#comment-14229895
 ] 

Niels Basjes commented on AVRO-1614:


Working on alternative approach.



[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-2014-12-01-v2.patch

First version of doing this via the Builder pattern.



[jira] [Commented] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229920#comment-14229920
 ] 

Niels Basjes commented on AVRO-1614:


Basic idea of this approach:
In addition to the actual value of a field, a Builder can now also hold a 
Builder for that field.
When that is used, the sub-record can stay in its incomplete form inside a 
Builder.
So any Builder instance has a getFooBuilder() that returns the existing 
Builder for the Foo field or creates a new one, provided such a builder is 
supported for that field.

As a consequence:
- schema validation is postponed until build() is actually called.
- for the fields where such a Builder is used, the build() call becomes 
recursive.

So in my testing code I can now do this:
{code:Java}
Measurement.Builder measurementBuilder = Measurement.newBuilder();

measurementBuilder
    .getTransportBuilder()
        .getConnectionBuilder()
            .getNetworkConnectionBuilder()
                .setNetworkAddress("127.0.0.1")
                .setNetworkType(NetworkType.IPv4);

Measurement measurement = measurementBuilder.build();
{code}
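
To make the mechanism concrete, here is a rough sketch of how a Builder that holds a sub-Builder could behave. It is hand-written and simplified to two levels (Measurement and Connection); it is not the code the patch generates, and the field layout and the IllegalStateException are assumptions for illustration.

```java
// Hypothetical two-level model of "Builders holding Builders".
class NestedBuilderSketch {
    static class Connection {
        final String networkAddress;
        private Connection(String networkAddress) { this.networkAddress = networkAddress; }

        static class Builder {
            private String networkAddress; // mandatory, no default
            Builder setNetworkAddress(String address) {
                this.networkAddress = address;
                return this;
            }
            Connection build() {
                // Validation happens only here, not while the tree is being filled.
                if (networkAddress == null) {
                    throw new IllegalStateException("networkAddress not set");
                }
                return new Connection(networkAddress);
            }
        }
    }

    static class Measurement {
        final Connection connection;
        private Measurement(Connection connection) { this.connection = connection; }
        static Builder newBuilder() { return new Builder(); }

        static class Builder {
            private Connection connection;                // finished value, if set directly
            private Connection.Builder connectionBuilder; // or a possibly incomplete sub-builder

            Builder setConnection(Connection connection) {
                this.connection = connection;
                this.connectionBuilder = null;
                return this;
            }

            // Returns the existing sub-builder or creates a new one.
            Connection.Builder getConnectionBuilder() {
                if (connectionBuilder == null) {
                    connectionBuilder = new Connection.Builder();
                    connection = null;
                }
                return connectionBuilder;
            }

            // Recursive step: build the sub-builder first, then this record.
            Measurement build() {
                Connection built =
                    (connectionBuilder != null) ? connectionBuilder.build() : connection;
                if (built == null) {
                    throw new IllegalStateException("connection not set");
                }
                return new Measurement(built);
            }
        }
    }

    public static void main(String[] args) {
        Measurement.Builder builder = Measurement.newBuilder();
        builder.getConnectionBuilder().setNetworkAddress("127.0.0.1");
        Measurement measurement = builder.build();
        System.out.println(measurement.connection.networkAddress);
    }
}
```

In this sketch the mandatory field stays mandatory: an incomplete sub-builder is tolerated while the tree is being filled and is only rejected when build() is called on the top-level builder.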

Open question: I have not seen any unit tests that validate the generated 
Java code. How should this be approached?




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Release Note: [JAVA] Builders can now hold Builder instances of sub schemas.
  Status: Patch Available  (was: Open)



[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Status: Open  (was: Patch Available)



[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Attachment: (was: AVRO-1614-2014-12-01-v2.patch)



[jira] [Created] (AVRO-1616) Fix .gitignore to exclude IntelliJ files.

2014-12-01 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1616:
--

 Summary: Fix .gitignore to exclude IntelliJ files.
 Key: AVRO-1616
 URL: https://issues.apache.org/jira/browse/AVRO-1616
 Project: Avro
  Issue Type: Improvement
Reporter: Niels Basjes
Priority: Trivial


The IntelliJ project files are not ignored in .gitignore.





[jira] [Updated] (AVRO-1616) Fix .gitignore to exclude IntelliJ files.

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1616:
---
Attachment: AVRO-1616-20141201-v1.patch

Simply added a few files to ignore in the future.



[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-20141201-v2.patch

Fixed the patch (previous one had a big mistake in it).



[jira] [Updated] (AVRO-1616) Fix .gitignore to exclude IntelliJ files.

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1616:
---
Status: Patch Available  (was: Open)



[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Status: Patch Available  (was: Open)

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
 Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch


 Sometimes the Avro structure becomes deeply nested.
 If in such a scenario you want to be able to set a specific value deep in 
 this tree you want to do this:
 {code}
 public void setSomething(String value) {
 myStruct
 .getFoo()
 .getBar()
 .getOne()
 .getOther()
 .setSomething(value);
 }
 {code}
 The 'problem' I ran into is that any of the 4 get methods can return a null 
 value so the code I have to write is really huge.
 For every step in this method I have to build null checks and create the 
 underlying instance if it is null.
 I already started writing helper methods to do this for parts of my tree.
 To solve this in a way that makes this code readable I came up with the 
 following which I want to propose to you guys (before I start working on a 
 patch).
 My idea is to generate a new 'get' method in addition to the existing normal 
 get method for the regular instance of the class.
 So in addition to the 
 {code}
 public Foo getFoo() {
 return foo;
 }
 {code}
 I propose to generate something like this as well in the cases where this is 
 a type of structure that you may want to traverse as shown in the example.
 {code}
 public Foo getAlwaysFoo() {
     if (foo == null) {
         setFoo(Foo.newBuilder().build());
     }
     return foo;
 }
 {code}
 This way the automatically created instance immediately has all the defaults 
 I have defined.
 With this naming my code stays readable, because it will look like this:
 {code}
 public void setSomething(String value) {
     myStruct
         .getAlwaysFoo()
         .getAlwaysBar()
         .getAlwaysOne()
         .getAlwaysOther()
         .setSomething(value);
 }
 {code}
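
To make the difference concrete, here is a minimal, self-contained sketch of the idea described above. It is only an illustration under stated assumptions: MyStruct, Foo and Bar are hypothetical hand-written classes standing in for Avro-generated records, the chain is shortened to two levels, and plain constructors are used where generated code would call Foo.newBuilder().build(). It is not the code produced by any of the attached patches.

```java
// Sketch only: Foo, Bar and MyStruct are hypothetical hand-written stand-ins
// for Avro-generated record classes; this is not the code the patch generates.
public class AlwaysGetterSketch {

    static class Bar {
        private String something;
        public String getSomething() { return something; }
        public void setSomething(String value) { this.something = value; }
    }

    static class Foo {
        private Bar bar;
        public Bar getBar() { return bar; }
        public void setBar(Bar value) { this.bar = value; }

        // Proposed style: create the nested instance on demand.
        public Bar getAlwaysBar() {
            if (bar == null) {
                setBar(new Bar());
            }
            return bar;
        }
    }

    static class MyStruct {
        private Foo foo;
        public Foo getFoo() { return foo; }
        public void setFoo(Foo value) { this.foo = value; }

        public Foo getAlwaysFoo() {
            if (foo == null) {
                setFoo(new Foo());
            }
            return foo;
        }
    }

    // Without the proposal: one null check per level of nesting.
    static void setSomethingByHand(MyStruct myStruct, String value) {
        Foo foo = myStruct.getFoo();
        if (foo == null) {
            foo = new Foo();
            myStruct.setFoo(foo);
        }
        Bar bar = foo.getBar();
        if (bar == null) {
            bar = new Bar();
            foo.setBar(bar);
        }
        bar.setSomething(value);
    }

    // With the proposal: the chain cannot hit a null on the way down.
    static void setSomethingWithAlways(MyStruct myStruct, String value) {
        myStruct.getAlwaysFoo().getAlwaysBar().setSomething(value);
    }

    public static void main(String[] args) {
        MyStruct a = new MyStruct();
        setSomethingByHand(a, "hello");
        MyStruct b = new MyStruct();
        setSomethingWithAlways(b, "hello");
        // Both approaches end up with the same nested value.
        System.out.println(a.getFoo().getBar().getSomething()
            .equals(b.getFoo().getBar().getSomething()));
    }
}
```

Note that the plain getFoo() keeps returning null for an unset field, so existing callers that rely on null to detect "not set" would be unaffected in this sketch.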





[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-01 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Status: Open  (was: Patch Available)

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
 Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-02 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-20141202-v3.patch

Updated the patch:
- Fixed a bug (cloning a Builder now clones recursively)
- Fixed existing unit test
- Added specific unit tests for new feature.

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
 Attachments: AVRO-1614-20141027-v1.patch, 
 AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-02 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Labels: java  (was: )
Status: Patch Available  (was: Open)

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
  Labels: java
 Attachments: AVRO-1614-20141027-v1.patch, 
 AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch




[jira] [Commented] (AVRO-1614) Always getting a value...

2014-12-03 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232827#comment-14232827
 ] 

Niels Basjes commented on AVRO-1614:


@[~cutting]: 

- Good idea to add additional tests to validate builder and Builder. I'll 
also add builderBuilder, value and this to make sure we cover all 
the edge cases.

- While making this patch I also noticed that a lot of whitespace had changed. 
The main cause is that some of the files (the Player.java files) are generated, 
and simply adding this feature changed those files a lot. Because I was already 
touching those files so heavily I chose to remove all the trailing spaces in 
one go, which you say is too much. OK.
I understand the downside of this choice, so I'll create a patch with the fewest 
possible whitespace changes. Shall I create a new issue afterwards to clean 
this up (I really like clean code!)?

- I don't quite understand the point regarding the tests. I put them under 
lang/java/ipc because the compiler is available there and can generate Java 
code from the schema definitions. My tests are intended to validate that the 
generated code behaves as intended (I'm actually unit testing the code 
generated by the compiler). Putting them under ipc seemed like the best and 
easiest place. To avoid conflicts with existing test code I added a new IDL 
that resides in its own package: org.apache.avro.test.http. So did I put the 
tests in the right place? If not, what is the right place? 


 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
  Labels: java
 Attachments: AVRO-1614-20141027-v1.patch, 
 AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-03 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Status: Open  (was: Patch Available)

Canceling patch to implement review feedback.

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
  Labels: java
 Attachments: AVRO-1614-20141027-v1.patch, 
 AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-04 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Status: Patch Available  (was: Open)

Please review

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
  Labels: java
 Attachments: AVRO-1614-20141027-v1.patch, 
 AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch, 
 AVRO-1614-20141204-v4.patch




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-04 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-20141204-v4.patch

- Added a new test IDL to seek out collisions (builder, Builder, etc.).
- Fixed (existing) bug that came to light when using the record name Builder.
- Reduced the number of whitespace changes as much as possible.

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
  Labels: java
 Attachments: AVRO-1614-20141027-v1.patch, 
 AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch, 
 AVRO-1614-20141204-v4.patch




[jira] [Created] (AVRO-1619) Generate better JavaDoc

2014-12-10 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1619:
--

 Summary: Generate better JavaDoc
 Key: AVRO-1619
 URL: https://issues.apache.org/jira/browse/AVRO-1619
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.7.7
Reporter: Niels Basjes


Assume the following IDL snippet:
{code}
@namespace("nl.basjes.avro.test")
protocol Something {
  record MyRecord {
    /** The time (epoch in milliseconds since 1970-01-01) */
    long timestamp;
  }
}
{code}

The currently generated java code looks like this:

{code}
  /**
   * Gets the value of the 'timestamp' field.
   * The time (epoch in milliseconds since 1970-01-01) when the event occurred   */
  public java.lang.Long getTimestamp() {
    return timestamp;
  }

  /**
   * Sets the value of the 'timestamp' field.
   * The time (epoch in milliseconds since 1970-01-01) when the event occurred   * @param value the value to set.
   */
  public void setTimestamp(java.lang.Long value) {
    this.timestamp = value;
  }
{code}

Because the @param tag does not start on a new line, my IDE (IntelliJ 14) does 
not show it as a parameter.

In addition, the getters and setters within the Builder lack the field's 
documentation and the @param tag completely.

{code}
/** Gets the value of the 'timestamp' field */
public java.lang.Long getTimestamp() {
  return timestamp;
}

/** Sets the value of the 'timestamp' field */
public nl.basjes.avro.test.MyRecord.Builder setTimestamp(long value) {
  validate(fields()[0], value);
  this.timestamp = value;
  fieldSetFlags()[0] = true;
  return this; 
}
{code}
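
For comparison, the following is a rough sketch of a Javadoc layout that would avoid the problem described above: the field documentation ends on its own line and each tag starts on a new line. MyRecord here is a hypothetical hand-written stand-in for the generated class, and the exact wording and the @return tag are assumptions; this is not the output of any attached patch.

```java
// Sketch only: a hand-written stand-in for a generated record class, showing
// one possible Javadoc layout. It is not the output of the attached patch.
public class JavadocLayoutSketch {

    static class MyRecord {
        private Long timestamp;

        /**
         * Gets the value of the 'timestamp' field.
         * The time (epoch in milliseconds since 1970-01-01)
         * @return the value of the 'timestamp' field.
         */
        public Long getTimestamp() {
            return timestamp;
        }

        /**
         * Sets the value of the 'timestamp' field.
         * The time (epoch in milliseconds since 1970-01-01)
         * @param value the value to set.
         */
        public void setTimestamp(Long value) {
            this.timestamp = value;
        }
    }

    public static void main(String[] args) {
        MyRecord myRecord = new MyRecord();
        myRecord.setTimestamp(1418169600000L); // arbitrary epoch milliseconds
        System.out.println(myRecord.getTimestamp());
    }
}
```

The same comment layout could be applied to the Builder's getters and setters, which is the second gap the issue points out.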







[jira] [Updated] (AVRO-1619) Generate better JavaDoc

2014-12-11 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1619:
---
Attachment: AVRO-1619-2014-12-11-v1.patch

Unfortunately changing the comments also means massive changes in the two 
Player.java files.

 Generate better JavaDoc
 ---

 Key: AVRO-1619
 URL: https://issues.apache.org/jira/browse/AVRO-1619
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.7.7
Reporter: Niels Basjes
 Attachments: AVRO-1619-2014-12-11-v1.patch




[jira] [Updated] (AVRO-1619) Generate better JavaDoc

2014-12-11 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1619:
---
Status: Patch Available  (was: Open)

 Generate better JavaDoc
 ---

 Key: AVRO-1619
 URL: https://issues.apache.org/jira/browse/AVRO-1619
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.7.7
Reporter: Niels Basjes
 Attachments: AVRO-1619-2014-12-11-v1.patch




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-16 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Status: Open  (was: Patch Available)

Patch no longer merges after recent commits.

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
  Labels: java
 Attachments: AVRO-1614-20141027-v1.patch, 
 AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch, 
 AVRO-1614-20141204-v4.patch




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-16 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-2014-12-16-v5.patch

- Patch updated so it merges again.
- Minor layout / javadoc tweaks compared to previous version.

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
  Labels: java
 Attachments: AVRO-1614-2014-12-16-v5.patch, 
 AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, 
 AVRO-1614-20141202-v3.patch, AVRO-1614-20141204-v4.patch




[jira] [Updated] (AVRO-1614) Always getting a value...

2014-12-16 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1614:
---
Status: Patch Available  (was: Open)

[~cutting] Updated patch

 Always getting a value...
 -

 Key: AVRO-1614
 URL: https://issues.apache.org/jira/browse/AVRO-1614
 Project: Avro
  Issue Type: New Feature
  Components: java
Reporter: Niels Basjes
  Labels: java
 Attachments: AVRO-1614-2014-12-16-v5.patch, 
 AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, 
 AVRO-1614-20141202-v3.patch, AVRO-1614-20141204-v4.patch




[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment

2014-12-16 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248477#comment-14248477
 ] 

Niels Basjes commented on AVRO-1537:


I'm just reading up on Docker and had an idea: 
What if, instead of doing 
{code}
# Clone Avro
RUN git clone http://git.apache.org/avro.git
RUN svn checkout https://svn.apache.org/repos/asf/avro/trunk/ avro-trunk
{code}

you used something like this (warning: I just read about this command and have 
not had time to try it out yet):
{code}
ONBUILD ADD . avro
{code}

See: https://docs.docker.com/reference/builder/#onbuild

That would add your local version of Avro (which will most likely contain the 
change you're working on) to the container.
In that scenario you would be able to run all tests for all languages/versions 
on the patched code before submitting it.


 Make it easier to set up a multi-language build environment
 ---

 Key: AVRO-1537
 URL: https://issues.apache.org/jira/browse/AVRO-1537
 Project: Avro
  Issue Type: Improvement
Reporter: Martin Kleppmann
 Attachments: AVRO-1537.patch, AVRO-1537.patch


 It's currently quite tedious to set up an environment in which the Avro test 
 suites for all supported languages can be run, and in which release 
 candidates can be built. This is especially so when we need to test against 
 several different versions of a programming language or VM (e.g. 
 JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1).
 Our shared Hudson server isn't an ideal solution, because it only runs tests 
 on changes that are already committed, and maintenance of the server can't 
 easily be shared across the community.
 I think a Docker image might be a good solution, since it could be set up by 
 one person, shared with all Avro developers, and maintained by the community 
 on an ongoing basis. But other VM solutions (Vagrant, for example?) might 
 work just as well. Suggestions welcome.
 Related resources:
 * Using AWS (setting up an EC2 instance for Avro build and release): 
 https://cwiki.apache.org/confluence/display/AVRO/How+To+Release#HowToRelease-UsingAWSforAvroBuildandRelease
 * Testing multiple versions of Ruby in CI: AVRO-1515





[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment

2014-12-16 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248516#comment-14248516
 ] 

Niels Basjes commented on AVRO-1537:


Perhaps using this option with docker run is better:
{code}
 -v $(pwd):/somewhere/avro 
{code}


 Make it easier to set up a multi-language build environment
 ---

 Key: AVRO-1537
 URL: https://issues.apache.org/jira/browse/AVRO-1537
 Project: Avro
  Issue Type: Improvement
Reporter: Martin Kleppmann
 Attachments: AVRO-1537.patch, AVRO-1537.patch




[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment

2014-12-17 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249839#comment-14249839
 ] 

Niels Basjes commented on AVRO-1537:


Hi [~tomwhite], I played around with the last version you uploaded. (Apparently 
I wasn't looking at the last version when I wrote the previous comment).

What I changed here locally is that I removed the last 3 lines from your 
Dockerfile (i.e. the git clone and svn checkout) and changed the build.sh like 
below.
Now you can do a simple {code}./build.sh safetest{code} and validate the entire 
build for all languages without needing to install anything tricky.
If I want to run a single command manually from within that environment I can 
simply do {code}./build.sh shell{code} and get a shell in the docker container.

Let me know what you think of this.

{code}
diff --git a/build.sh b/build.sh
index 06961c0..cff8dad 100755
--- a/build.sh
+++ b/build.sh
@@ -17,12 +17,16 @@

 set -e   # exit on error

-cd `dirname $0`  # connect to root
+SOURCEDIR=$( cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
+cd ${SOURCEDIR}  # connect to root

 VERSION=`cat share/VERSION.txt`

+# The cryptic name is to avoid conflicts in case of a shared development system (like jenkins)
+DOCKER_IMAGE_NAME=avro_build_image_$(md5sum ${SOURCEDIR}/docker/Dockerfile | cut -d' ' -f1)
+
 function usage {
-  echo "Usage: $0 {test|dist|sign|clean}"
+  echo "Usage: $0 {test|safetest|shell|dist|sign|clean}"
   exit 1
 }

@@ -31,6 +35,13 @@ then
   usage
 fi

+function runindocker {
+docker build -t ${DOCKER_IMAGE_NAME} ${SOURCEDIR}/docker
+# By mapping the .m2 directory you can do an mvn install from within the container and use the result on your normal system.
+# And this also is a significant speedup in subsequent builds because the dependencies are downloaded only once.
+docker run --rm=true -t -i -v ${SOURCEDIR}:/root/avro -w /root/avro -v ${HOME}/.m2:/root/.m2 ${DOCKER_IMAGE_NAME} $1 $2 $3 $4 $5 $6
+}
+
 set -x   # echo commands

 for target in $@
@@ -72,7 +83,14 @@ case $target in
 (cd lang/java; mvn package -DskipTests)
 # run interop rpc test
 /bin/bash share/test/interop/bin/test_rpc_interop.sh
+   ;;
+
+safetest)
+   runindocker ./build.sh test
+   ;;

+shell)
+   runindocker /bin/bash
 ;;

 dist)

{code}


 Make it easier to set up a multi-language build environment
 ---

 Key: AVRO-1537
 URL: https://issues.apache.org/jira/browse/AVRO-1537
 Project: Avro
  Issue Type: Improvement
Reporter: Martin Kleppmann
 Attachments: AVRO-1537.patch, AVRO-1537.patch




[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment

2014-12-18 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251528#comment-14251528
 ] 

Niels Basjes commented on AVRO-1537:


Hi [~tomwhite], 
I verified this on my system: running the script for the first time takes 'ages', 
but running it the second time I see the command prompt within two seconds.
So if you get very different results I suspect a problem on your system.

I agree that the 'safetest' name is a poor choice. Perhaps simply having the 
{{./build.sh shell}} is good enough for the first version?

I also noticed that all files in the avro directory become owned by {{root}} 
because that is the default user inside docker.
So after exiting the docker environment I could no longer do a {{mvn clean}} 
because the generated files (.class, .jar, etc) are all stored in directories 
owned by the root user.

In an attempt to solve this I came up with this.
I effectively create an additional layer that simply recreates the current user 
(same username, userid, groupid) in the docker environment.

{code}
function runindocker {
docker build -t ${DOCKER_IMAGE_NAME} ${SOURCEDIR}/docker
docker build -t ${DOCKER_IMAGE_NAME}_${USER} - <<UserSpecificDocker
FROM ${DOCKER_IMAGE_NAME}
RUN groupadd -g $(id -g) ${USER}
RUN useradd  -g $(id -g) -u $(id -u) -k /root -m ${USER}
ENV HOME /home/${USER}
UserSpecificDocker
# By mapping the .m2 directory you can do an mvn install from within the container and use the result on your normal system.
# And this also is a significant speedup in subsequent builds because the dependencies are downloaded only once.
docker run --rm=true -t -i -v ${SOURCEDIR}:/home/${USER}/avro -w /home/${USER}/avro -v ${HOME}/.m2:/home/${USER}/.m2 -u ${USER} ${DOCKER_IMAGE_NAME}_${USER} $1 $2 $3 $4 $5 $6
}
{code}

 Make it easier to set up a multi-language build environment
 ---

 Key: AVRO-1537
 URL: https://issues.apache.org/jira/browse/AVRO-1537
 Project: Avro
  Issue Type: Improvement
Reporter: Martin Kleppmann
Assignee: Tom White
 Attachments: AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch




[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment

2014-12-19 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253656#comment-14253656
 ] 

Niels Basjes commented on AVRO-1537:


Hi [~tomwhite],

I checked your current patch and I really like it.
I did a test on CentOS 6.5 x86_64 inside VirtualBox on Windows 7 and this 
works really well.
The first build takes a while and then subsequent runs appear on the screen 
within a second.
The Java build went perfectly and I let it run for a while and it all seems to 
work fine (until it all stops with the known bug {{./build.sh: line 25: 
node_modules/grunt/bin/grunt: No such file or directory}}).

A few feedback points:
# Using names like {{avro-build}} and {{avro-build-$USER}} means that 
there can only be a single image at a time with that name.
So if one user wants to enhance the image (i.e. edit the Dockerfile), a 
different user on the same system cannot use the build system reliably; there 
will be conflicts between those two users. I realize this is an extremely 
unlikely scenario in normal development situations. I'm not sure how unlikely 
it is once we make this docker part of the normal build and run it on a shared 
CI environment (i.e. Jenkins) where everything runs as the same user.
I think it is fine to do it this way as long as we realize (and preferably 
document) this caveat. Perhaps a comment as simple as "The build.sh docker 
environment is intended for use on personal development systems." should 
suffice.
# The Dockerfile installs the various tools simply by naming the tool that 
needs to be installed. As a result the docker image will be different if a new 
version of one of the tools is released. So I was wondering: 
## What happens if in the future one of those tools introduces a regression or 
an incompatible change in its API? Like the PHP NaN example? Is that a good 
thing because you're actually building and testing against what end users will 
be using too? Or is this a bad thing because you have a changing build 
environment?
## What happens if in the future one of those tools is updated and it is an 
update that is really needed for the build to succeed? How can it be 
validated/ensured that the image is updated on the developers' desktops too? 
Perhaps the project can be enhanced to ensure/enforce minimal versions of the 
tools that are needed?
# I vote you push the improvements back to HBase, on which you based the 
original (at least create a ticket in HBase: {{Evaluate Docker improvements 
from AVRO project}}).


 Make it easier to set up a multi-language build environment
 ---

 Key: AVRO-1537
 URL: https://issues.apache.org/jira/browse/AVRO-1537
 Project: Avro
  Issue Type: Improvement
Reporter: Martin Kleppmann
Assignee: Tom White
 Attachments: AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, 
 AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch




[jira] [Created] (AVRO-1624) Surefire forkMode is deprecated

2014-12-19 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1624:
--

 Summary: Surefire forkMode is deprecated
 Key: AVRO-1624
 URL: https://issues.apache.org/jira/browse/AVRO-1624
 Project: Avro
  Issue Type: Bug
  Components: java, trevni
Reporter: Niels Basjes


During a java build I see this warning message several times:
{code}
[INFO] --- maven-surefire-plugin:2.17:test (default-test) @ avro ---
[WARNING] The parameter forkMode is deprecated since version 2.14. Use 
forkCount and reuseForks instead.
{code}






[jira] [Commented] (AVRO-1624) Surefire forkMode is deprecated

2014-12-19 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253673#comment-14253673
 ] 

Niels Basjes commented on AVRO-1624:


Other projects have also gotten this:
https://github.com/apache/camel/pull/331/files

 Surefire forkMode is deprecated
 ---

 Key: AVRO-1624
 URL: https://issues.apache.org/jira/browse/AVRO-1624
 Project: Avro
  Issue Type: Bug
  Components: java, trevni
Reporter: Niels Basjes



[jira] [Updated] (AVRO-1624) Surefire forkMode is deprecated

2014-12-19 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1624:
---
Attachment: AVRO-1624-2014-12-19-v1.patch

{code}
$ find . -type f -print0 | xargs -0 fgrep -i forkMode
./lang/java/pom.xml:<forkMode>always</forkMode>
./lang/java/pom.xml:  <forkMode>once</forkMode>
{code}

http://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html
{quote}
|Old Setting|New Setting|
|forkMode=once (default)|forkCount=1 (default), reuseForks=true (default)|
|forkMode=always|forkCount=1 (default), reuseForks=false|
{quote}


 Surefire forkMode is deprecated
 ---

 Key: AVRO-1624
 URL: https://issues.apache.org/jira/browse/AVRO-1624
 Project: Avro
  Issue Type: Bug
  Components: java, trevni
Reporter: Niels Basjes
 Attachments: AVRO-1624-2014-12-19-v1.patch




[jira] [Updated] (AVRO-1624) Surefire forkMode is deprecated

2014-12-19 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1624:
---
Status: Patch Available  (was: Open)

 Surefire forkMode is deprecated
 ---

 Key: AVRO-1624
 URL: https://issues.apache.org/jira/browse/AVRO-1624
 Project: Avro
  Issue Type: Bug
  Components: java, trevni
Reporter: Niels Basjes
 Attachments: AVRO-1624-2014-12-19-v1.patch




[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment

2014-12-19 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254113#comment-14254113
 ] 

Niels Basjes commented on AVRO-1537:


On Linux docker does not _require_ sudo.

In fact the docker manual says this:
https://docs.docker.com/installation/ubuntulinux/#giving-non-root-access
{quote}
Starting in version 0.5.3, if you (or your Docker installer) create a Unix 
group called docker and add users to it, then the docker daemon will make the 
ownership of the Unix socket read/writable by the docker group when the daemon 
starts. The docker daemon must always run as the root user, but if you run the 
docker client as a user in the docker group then you don't need to add sudo to 
all the client commands.
{quote}

I always set it up like this (i.e. adding my user to the docker group) and this 
means that the $SUDO_USER is always unset.

So to make both situations work I expect we should have something that only 
overwrites $USER with $SUDO_USER if it was set.


 Make it easier to set up a multi-language build environment
 ---

 Key: AVRO-1537
 URL: https://issues.apache.org/jira/browse/AVRO-1537
 Project: Avro
  Issue Type: Improvement
Reporter: Martin Kleppmann
Assignee: Tom White
 Attachments: AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, 
 AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch




[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment

2014-12-20 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254963#comment-14254963
 ] 

Niels Basjes commented on AVRO-1537:


I just realized that the original issue states {quote}This is especially so 
when we need to test against several different versions of a programming 
language or VM (e.g. JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1).{quote}
What we have created so far is actually a good way of setting up the 'default' 
build environment (which is a very good first step).

Any ideas how we're going to approach the 'real question'?

 Make it easier to set up a multi-language build environment
 ---

 Key: AVRO-1537
 URL: https://issues.apache.org/jira/browse/AVRO-1537
 Project: Avro
  Issue Type: Improvement
Reporter: Martin Kleppmann
Assignee: Tom White
 Attachments: AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, 
 AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch




[jira] [Commented] (AVRO-1630) Creating Builder from instance loses data

2015-02-03 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303019#comment-14303019
 ] 

Niels Basjes commented on AVRO-1630:


@[~tomwhite] What else is needed to get this committed?

 Creating Builder from instance loses data
 -

 Key: AVRO-1630
 URL: https://issues.apache.org/jira/browse/AVRO-1630
 Project: Avro
  Issue Type: Bug
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1630-2015-01-14-v1.patch


 If you create a builder from an instance and then use the .getXxxBuilder() 
 method you get an empty builder instead of a builder that contains the data 
 elements from the original instance.





[jira] [Commented] (AVRO-1630) Creating Builder from instance loses data

2015-01-14 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277847#comment-14277847
 ] 

Niels Basjes commented on AVRO-1630:


I consider this a usage scenario I overlooked when writing AVRO-1614
[~cutting] / [~tomwhite] Can you guys please double check I haven't missed 
anything?

 Creating Builder from instance loses data
 -

 Key: AVRO-1630
 URL: https://issues.apache.org/jira/browse/AVRO-1630
 Project: Avro
  Issue Type: Bug
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1630-2015-01-14-v1.patch




[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.

2015-01-20 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1633:
---
Description: 
The currently generated code contains these two methods for the Builder 
instances (code sample was simplified):
{code}public Request.Builder setConnection(NetworkConnection value)
public Request.Builder setConnectionBuilder(NetworkConnection.Builder 
value){code}

My proposal: Add in addition the method:
{code}public Request.Builder setConnection(NetworkConnection.Builder 
value){code}

Advantage:
You can do {{.setConnection(something)}} and pass either a 
{{NetworkConnection}} or a {{NetworkConnection.Builder}}.

Disadvantage:
Explicitly setting a {{null}} will trigger a Multiple implementations error 
and as such will need an explicit typecast.
This may be considered breaking backward compatibility!



  was:
The currently generated code contains these two methods for the Builder 
instances (code sample was simplified):
{code}public Request.Builder setConnection(NetworkConnection value)
public Request.Builder setConnectionBuilder(NetworkConnection.Builder 
value){code}

My proposal: Add in addition the method:
{code}public Request.Builder setConnection(NetworkConnection.Builder 
value){code}

Advantage:
You can do {{.setConnection(something)}} and pass either a 
{{NetworkConnection}} or a {{NetworkConnection.Builder}}.

Disadvantage:
Explicitly setting a {{null}} will trigger a Multiple implementations error 
and as such will need an explicit typecast.




 Add additional setXxx(Builder) method to make user code more readable.
 --

 Key: AVRO-1633
 URL: https://issues.apache.org/jira/browse/AVRO-1633
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1633-2015-01-20-v1.patch




[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.

2015-01-20 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1633:
---
Description: 
The currently generated code contains these two methods for the Builder 
instances (code sample was simplified):
{code}public Request.Builder setConnection(NetworkConnection value)
public Request.Builder setConnectionBuilder(NetworkConnection.Builder 
value){code}

My proposal: Add in addition the method:
{code}public Request.Builder setConnection(NetworkConnection.Builder 
value){code}

Advantage:
* You can do {{.setConnection(something)}} and pass either a 
{{NetworkConnection}} or a {{NetworkConnection.Builder}}.
* User code becomes a bit more readable.

Disadvantages:
* Explicitly setting a {{null}} will trigger a Multiple implementations error 
and as such will need an explicit typecast.
** _This may be considered breaking backward compatibility!_
To solve this you must do either this:
{code}.setConnection((NetworkConnection)null){code}
or this {code}
NetworkConnection nc = null;
...
.setConnection(nc){code}


  was:
The currently generated code contains these two methods for the Builder 
instances (code sample was simplified):
{code}public Request.Builder setConnection(NetworkConnection value)
public Request.Builder setConnectionBuilder(NetworkConnection.Builder 
value){code}

My proposal: Add in addition the method:
{code}public Request.Builder setConnection(NetworkConnection.Builder 
value){code}

Advantage:
You can do {{.setConnection(something)}} and pass either a 
{{NetworkConnection}} or a {{NetworkConnection.Builder}}.

Disadvantage:
Explicitly setting a {{null}} will trigger a Multiple implementations error 
and as such will need an explicit typecast.
This may be considered breaking backward compatibility!




 Add additional setXxx(Builder) method to make user code more readable.
 --

 Key: AVRO-1633
 URL: https://issues.apache.org/jira/browse/AVRO-1633
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1633-2015-01-20-v1.patch




[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.

2015-01-20 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1633:
---
Status: Patch Available  (was: Open)

Hi [~cutting],
Please classify this as either _wanted_ or _unwanted_ because of this backward 
compatibility point when setting a 'null' directly.
Thanks.


 Add additional setXxx(Builder) method to make user code more readable.
 --

 Key: AVRO-1633
 URL: https://issues.apache.org/jira/browse/AVRO-1633
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1633-2015-01-20-v1.patch




[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.

2015-01-20 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1633:
---
Attachment: AVRO-1633-2015-01-20-v1.patch

The patch that implements the proposal.

 Add additional setXxx(Builder) method to make user code more readable.
 --

 Key: AVRO-1633
 URL: https://issues.apache.org/jira/browse/AVRO-1633
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1633-2015-01-20-v1.patch


 The currently generated code contains these two methods for the Builder 
 instances (code sample was simplified):
 {code}public Request.Builder setConnection(NetworkConnection value)
 public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code}
 My proposal: Add in addition the method:
 {code}public Request.Builder setConnection(NetworkConnection.Builder value){code}
 Advantage:
 You can do {{.setConnection(something)}} and pass either a 
 {{NetworkConnection}} or a {{NetworkConnection.Builder}}.
 Disadvantage:
 Explicitly setting a {{null}} will trigger a Multiple implementations error 
 and as such will need an explicit typecast.





[jira] [Created] (AVRO-1630) Creating Builder from instance loses data

2015-01-14 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1630:
--

 Summary: Creating Builder from instance loses data
 Key: AVRO-1630
 URL: https://issues.apache.org/jira/browse/AVRO-1630
 Project: Avro
  Issue Type: Bug
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0


If you create a builder from an instance and then use the .getXxxBuilder() 
method you get an empty builder instead of a builder that contains the data 
elements from the original instance.





[jira] [Updated] (AVRO-1630) Creating Builder from instance loses data

2015-01-14 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1630:
---
Status: Patch Available  (was: Open)

 Creating Builder from instance loses data
 -

 Key: AVRO-1630
 URL: https://issues.apache.org/jira/browse/AVRO-1630
 Project: Avro
  Issue Type: Bug
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1630-2015-01-14-v1.patch


 If you create a builder from an instance and then use the .getXxxBuilder() 
 method you get an empty builder instead of a builder that contains the data 
 elements from the original instance.





[jira] [Updated] (AVRO-1630) Creating Builder from instance loses data

2015-01-14 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1630:
---
Attachment: AVRO-1630-2015-01-14-v1.patch

This fix changes the getXxxBuilder method from this
{code}
public org.apache.avro.test.http.NetworkConnection.Builder getConnectionBuilder() {
  if (connectionBuilder == null) {
    setConnectionBuilder(org.apache.avro.test.http.NetworkConnection.newBuilder());
  }
  return connectionBuilder;
}
{code}
into this
{code}
public org.apache.avro.test.http.NetworkConnection.Builder getConnectionBuilder() {
  if (connectionBuilder == null) {
    if (hasConnection()) {
      setConnectionBuilder(org.apache.avro.test.http.NetworkConnection.newBuilder(connection));
    } else {
      setConnectionBuilder(org.apache.avro.test.http.NetworkConnection.newBuilder());
    }
  }
  return connectionBuilder;
}
{code}
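
To make the effect of this change concrete, here is a small self-contained sketch. All classes are hypothetical simplifications (a single networkAddress field, and a plain null check standing in for hasConnection()); the real generated code is more involved.

```java
public class SeededBuilderSketch {

    // Hypothetical stand-in for the nested generated record.
    static class NetworkConnection {
        final String networkAddress;

        NetworkConnection(String networkAddress) {
            this.networkAddress = networkAddress;
        }

        // Empty builder: no fields set.
        static Builder newBuilder() {
            return new Builder();
        }

        // Builder seeded with the values of an existing instance.
        static Builder newBuilder(NetworkConnection other) {
            Builder builder = new Builder();
            builder.networkAddress = other.networkAddress;
            return builder;
        }

        static class Builder {
            String networkAddress; // null means "not set" in this sketch
        }
    }

    // Hypothetical stand-in for the outer builder created from an instance.
    static class RequestBuilder {
        NetworkConnection connection;
        NetworkConnection.Builder connectionBuilder;

        RequestBuilder(NetworkConnection existing) {
            this.connection = existing;
        }

        // Assumption: "has" is modelled as a null check here.
        boolean hasConnection() {
            return connection != null;
        }

        // Behaviour before the fix: always starts from an empty builder.
        NetworkConnection.Builder getConnectionBuilderBeforeFix() {
            if (connectionBuilder == null) {
                connectionBuilder = NetworkConnection.newBuilder();
            }
            return connectionBuilder;
        }

        // Behaviour after the fix: reuse the data that is already there.
        NetworkConnection.Builder getConnectionBuilderAfterFix() {
            if (connectionBuilder == null) {
                connectionBuilder = hasConnection()
                    ? NetworkConnection.newBuilder(connection)
                    : NetworkConnection.newBuilder();
            }
            return connectionBuilder;
        }
    }

    public static void main(String[] args) {
        NetworkConnection original = new NetworkConnection("10.0.0.1");

        String before = new RequestBuilder(original)
            .getConnectionBuilderBeforeFix().networkAddress;
        String after = new RequestBuilder(original)
            .getConnectionBuilderAfterFix().networkAddress;

        System.out.println("before fix: " + before);
        System.out.println("after fix:  " + after);
    }
}
```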

 Creating Builder from instance loses data
 -

 Key: AVRO-1630
 URL: https://issues.apache.org/jira/browse/AVRO-1630
 Project: Avro
  Issue Type: Bug
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1630-2015-01-14-v1.patch


 If you create a builder from an instance and then use the .getXxxBuilder() 
 method you get an empty builder instead of a builder that contains the data 
 elements from the original instance.





[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment

2015-01-10 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272447#comment-14272447
 ] 

Niels Basjes commented on AVRO-1537:


+1 (non-binding)

 Make it easier to set up a multi-language build environment
 ---

 Key: AVRO-1537
 URL: https://issues.apache.org/jira/browse/AVRO-1537
 Project: Avro
  Issue Type: Improvement
Reporter: Martin Kleppmann
Assignee: Tom White
 Attachments: AVRO-1537-2015-01-08.patch, AVRO-1537.patch, 
 AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, 
 AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, 
 AVRO-1537.patch, AVRO-1537.patch


 It's currently quite tedious to set up an environment in which the Avro test 
 suites for all supported languages can be run, and in which release 
 candidates can be built. This is especially so when we need to test against 
 several different versions of a programming language or VM (e.g. 
 JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1).
 Our shared Hudson server isn't an ideal solution, because it only runs tests 
 on changes that are already committed, and maintenance of the server can't 
 easily be shared across the community.
 I think a Docker image might be a good solution, since it could be set up by 
 one person, shared with all Avro developers, and maintained by the community 
 on an ongoing basis. But other VM solutions (Vagrant, for example?) might 
 work just as well. Suggestions welcome.
 Related resources:
 * Using AWS (setting up an EC2 instance for Avro build and release): 
 https://cwiki.apache.org/confluence/display/AVRO/How+To+Release#HowToRelease-UsingAWSforAvroBuildandRelease
 * Testing multiple versions of Ruby in CI: AVRO-1515





[jira] [Commented] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.

2015-04-30 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521500#comment-14521500
 ] 

Niels Basjes commented on AVRO-1633:


For me having such polymorphism in the API makes the API easier to use and 
increases code readability.

 Add additional setXxx(Builder) method to make user code more readable.
 --

 Key: AVRO-1633
 URL: https://issues.apache.org/jira/browse/AVRO-1633
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1633-2015-01-20-v1.patch




[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.

2015-08-18 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1633:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

 Add additional setXxx(Builder) method to make user code more readable.
 --

 Key: AVRO-1633
 URL: https://issues.apache.org/jira/browse/AVRO-1633
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.0

 Attachments: AVRO-1633-2015-01-20-v1.patch




[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro

2016-03-10 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189473#comment-15189473
 ] 

Niels Basjes commented on AVRO-1704:


Note that having the "AVRO" prefix will also limit the number of needless calls 
to the Schema registry when bad records are put into the stream (like the Timer 
ticks example).

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>
> I'm currently using the Datafile format for encoding messages that are 
> written to Kafka and Cassandra. This seems rather wasteful:
> 1. I only encode a single record at a time, so there's no need for sync 
> markers and other metadata related to multi-record files.
> 2. The entire schema is inlined every time.
> However, the Datafile format is the only one that has been standardized, 
> meaning that I can read and write data with minimal effort across the various 
> languages in use in my organization. If there was a standardized format for 
> encoding single values that was optimized for out-of-band schema transfer, I 
> would much rather use that.
> I think the necessary pieces of the format would be:
> 1. A format version number.
> 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc.
> 3. The actual schema fingerprint (according to the type.)
> 4. Optional metadata map.
> 5. The encoded datum.
> The language libraries would implement a MessageWriter that would encode 
> datums in this format, as well as a MessageReader that, given a SchemaStore, 
> would be able to decode datums. The reader would decode the fingerprint and 
> ask its SchemaStore to return the corresponding writer's schema.
> The idea is that SchemaStore would be an abstract interface that allowed 
> library users to inject custom backends. A simple, file system based one 
> could be provided out of the box.





[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro

2016-03-10 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189402#comment-15189402
 ] 

Niels Basjes commented on AVRO-1704:


I've been looking into what kind of solution would work here, since I'm working 
on a project where we need data structures to go into Kafka and be available to 
multiple consumers.

The fundamental problem we need to solve is that of "Schema Evolution" in a 
streaming environment (let's assume Kafka with its built-in persistence of 
records).
We need three things to make this happen:
# A way to recognize that a 'blob' is a serialized AVRO record.
#* We can simply assume it is always an AVRO record. 
#* I think we should simply let such a record start with "AVRO" to ensure we 
can cleanly catch problems like STORM-512 (Summary: Timer ticks were written 
into Kafka, which caused a lot of deserialization errors when reading the AVRO 
records.)
# A way to determine the schema this was written with.
#* As indicated above I vote for using the CRC-64-AVRO. 
#** I noticed that a simple typo fix in the documentation of a Schema causes a 
new fingerprint to be generated. 
#** Proposal: I think we should 'clean' the schema before calculating the 
fingerprint, i.e. remove the things that do not impact the binary form of the 
record (like the doc field).
# A place where we can find the schemas using the fingerprint as the key.
#* Here I think (looking at AVRO-1124 and the fact that there are ready to run 
implementations like this [Schema 
Registry|http://docs.confluent.io/current/schema-registry/docs/index.html]) we 
should limit what we keep inside Avro to something like a "SchemaFactory" 
interface (as the storage/retrieval interface to get a Schema) and a very basic 
implementation that simply reads the available schemas from a (set of) 
property file(s). Using this, others can write additional implementations that 
can read/write to things like databases or the above-mentioned Schema Registry.
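
As a rough illustration of the third point, a lookup interface plus a property-based implementation could look like the sketch below. Everything here is an assumption made for illustration: the method name getSchemaJson, the choice to key the properties by the fingerprint in hexadecimal, the example fingerprint value, and returning the schema as a JSON string (to keep the sketch free of Avro dependencies) are not part of any existing Avro API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class SchemaFactorySketch {

    // Hypothetical storage/retrieval interface; only the name "SchemaFactory"
    // comes from the comment above.
    interface SchemaFactory {
        // Returns the schema JSON for a fingerprint, or null when unknown.
        String getSchemaJson(long fingerprint);
    }

    // Very basic implementation backed by java.util.Properties, where each
    // key is a 64-bit fingerprint in hex and each value is the schema JSON.
    static class PropertiesSchemaFactory implements SchemaFactory {
        private final Map<Long, String> schemas = new HashMap<>();

        PropertiesSchemaFactory(Properties props) {
            for (String key : props.stringPropertyNames()) {
                schemas.put(Long.parseUnsignedLong(key, 16), props.getProperty(key));
            }
        }

        @Override
        public String getSchemaJson(long fingerprint) {
            return schemas.get(fingerprint);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Made-up fingerprint value, for illustration only.
        props.setProperty("c15d3a1f2b4e6a70", "{\"type\":\"string\"}");

        SchemaFactory factory = new PropertiesSchemaFactory(props);
        System.out.println(factory.getSchemaJson(0xc15d3a1f2b4e6a70L));
        System.out.println(factory.getSchemaJson(1L));
    }
}
```

Other backends (a database, a remote schema registry) would implement the same interface, so readers only depend on the lookup contract.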

So to summarize, my proposal for the standard {{Single record serialization 
format}} can be written as:
{code}"AVRO"{code}

[~rdblue], I'm seeking feedback from you guys on this proposal. 


> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>


[jira] [Comment Edited] (AVRO-1704) Standardized format for encoding messages with Avro

2016-03-11 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190866#comment-15190866
 ] 

Niels Basjes edited comment on AVRO-1704 at 3/11/16 1:00 PM:
-

Thanks for pointing this out. 

My updated proposal for this:
{code}Avro<version><Fingerprint><Record>{code}
Where 
# "version" = 1 byte indicating the version (or "schema") of the rest of the 
bytes. 
if version == 0x00
# "Fingerprint" = the CRC-64-AVRO of the Canonical form of the Schema.
# "Record" = the record serialized to bytes using the existing serialization 
system.

I personally do not like these 'chopped' prefixes if there is no "really good 
reason to chop them" (like the length). 
Because the project's name is so short, in this proposal I'm sticking to using 
the full name of the project as the prefix: "Avro" (i.e. these 4 bytes: 0x41, 
0x76, 0x72, 0x6F).
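
The layout described above (the 4 prefix bytes, a 1-byte version, the fingerprint, then the serialized record) can be sketched as follows. This is only an illustration of the proposal, not an existing Avro API: the class and method names are hypothetical, the fingerprint is assumed to be a 64-bit value computed elsewhere (for example with Avro's SchemaNormalization.parsingFingerprint64), and the big-endian byte order is an assumption because the proposal does not specify one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SingleRecordFramingSketch {

    // The 4 prefix bytes 0x41, 0x76, 0x72, 0x6F.
    static final byte[] PREFIX = "Avro".getBytes(StandardCharsets.US_ASCII);
    static final byte VERSION = 0x00;
    static final int HEADER_LENGTH = PREFIX.length + 1 + Long.BYTES;

    // Wraps an already serialized record: prefix, version, fingerprint, record.
    static byte[] frame(long fingerprint, byte[] record) {
        ByteBuffer buffer = ByteBuffer.allocate(HEADER_LENGTH + record.length);
        buffer.put(PREFIX);
        buffer.put(VERSION);
        buffer.putLong(fingerprint); // assumption: big-endian (ByteBuffer default)
        buffer.put(record);
        return buffer.array();
    }

    // Cheap check that lets a reader reject foreign data (such as stray timer
    // ticks) before asking a schema store for anything.
    static boolean looksLikeAvro(byte[] blob) {
        return blob.length >= HEADER_LENGTH
            && Arrays.equals(Arrays.copyOfRange(blob, 0, PREFIX.length), PREFIX);
    }

    // Reads the fingerprint back out of a framed blob.
    static long fingerprintOf(byte[] blob) {
        return ByteBuffer.wrap(blob, PREFIX.length + 1, Long.BYTES).getLong();
    }

    public static void main(String[] args) {
        byte[] record = {42}; // placeholder for real Avro binary data
        byte[] blob = frame(0x0102030405060708L, record);

        System.out.println("framed length: " + blob.length);
        System.out.println("looks like Avro: " + looksLikeAvro(blob));
        System.out.println("fingerprint: " + Long.toHexString(fingerprintOf(blob)));
    }
}
```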



was (Author: nielsbasjes):
Thanks for pointing this out. 

My updated proposal for this:
{code}"Avro"<version><Fingerprint><Record>{code}
Where 
# "version" = 1 byte indicating the version (or "schema") of the rest of the 
bytes. 
if version == 0x00
# "Fingerprint" = the CRC-64-AVRO of the Canonical form of the Schema.
# "Record" = the record serialized to bytes using the existing serialization 
system.

I personally do not like these 'chopped' prefixes if there is no "really good 
reason to chop them" (like the length). 
Because the project's name is so short, in this proposal I'm sticking to using 
the full name of the project as the prefix: "Avro" (i.e. these 4 bytes: 0x41, 
0x76, 0x72, 0x6F).


> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>


[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro

2016-03-11 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190866#comment-15190866
 ] 

Niels Basjes commented on AVRO-1704:


Thanks for pointing this out. 

My updated proposal for this:
{code}"Avro"<version><Fingerprint><Record>{code}
Where 
# "version" = 1 byte indicating the version (or "schema") of the rest of the 
bytes. 
if version == 0x00
# "Fingerprint" = the CRC-64-AVRO of the Canonical form of the Schema.
# "Record" = the record serialized to bytes using the existing serialization 
system.

I personally do not like these 'chopped' prefixes if there is no "really good 
reason to chop them" (like the length). 
Because the project's name is so short, in this proposal I'm sticking to using 
the full name of the project as the prefix: "Avro" (i.e. these 4 bytes: 0x41, 
0x76, 0x72, 0x6F).


> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>


[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro

2016-04-13 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238996#comment-15238996
 ] 

Niels Basjes commented on AVRO-1704:


I have a first addition: Think about supporting encryption. 

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Niels Basjes
> Attachments: AVRO-1704-20160410.patch
>
>


[jira] [Commented] (AVRO-1826) build.sh rat fails over extra license files and many others.

2016-04-13 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238698#comment-15238698
 ] 

Niels Basjes commented on AVRO-1826:


Yes, that is an unintended mistake. I'll fix it and commit this weekend. Thanks

> build.sh rat fails over extra license files and many others.
> 
>
> Key: AVRO-1826
> URL: https://issues.apache.org/jira/browse/AVRO-1826
> Project: Avro
>  Issue Type: Bug
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1826-20160410.patch
>
>
> When running ./build.sh rat this will fail due to several license related 
> files we recently added.





[jira] [Assigned] (AVRO-1704) Standardized format for encoding messages with Avro

2016-04-08 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes reassigned AVRO-1704:
--

Assignee: Niels Basjes

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Niels Basjes
>


[jira] [Updated] (AVRO-1825) Allow running build.sh dist under git

2016-04-09 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1825:
---
Status: Patch Available  (was: Open)

> Allow running build.sh dist under git
> -
>
> Key: AVRO-1825
> URL: https://issues.apache.org/jira/browse/AVRO-1825
> Project: Avro
>  Issue Type: Improvement
>  Components: build
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1825-20160409.patch
>
>
> When working from a git clone instead of an svn checkout the build.sh dist 
> cannot run due to an explicit dependency on the fact that the working 
> directory must be an svn checkout.
> This should be a bit more flexible.





[jira] [Updated] (AVRO-1825) Allow running build.sh dist under git

2016-04-09 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1825:
---
Attachment: AVRO-1825-20160409.patch

The patch

> Allow running build.sh dist under git
> -
>
> Key: AVRO-1825
> URL: https://issues.apache.org/jira/browse/AVRO-1825
> Project: Avro
>  Issue Type: Improvement
>  Components: build
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1825-20160409.patch
>
>
> When working from a git clone instead of an svn checkout the build.sh dist 
> cannot run due to an explicit dependency on the fact that the working 
> directory must be an svn checkout.
> This should be a bit more flexible.





[jira] [Created] (AVRO-1825) Allow running build.sh dist under git

2016-04-09 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1825:
--

 Summary: Allow running build.sh dist under git
 Key: AVRO-1825
 URL: https://issues.apache.org/jira/browse/AVRO-1825
 Project: Avro
  Issue Type: Improvement
  Components: build
Reporter: Niels Basjes
Assignee: Niels Basjes


When working from a git clone instead of an svn checkout the build.sh dist cannot 
run due to an explicit dependency on the fact that the working directory must 
be an svn checkout.
This should be a bit more flexible.





[jira] [Created] (AVRO-1828) Add EditorConfig file

2016-04-12 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1828:
--

 Summary: Add EditorConfig file
 Key: AVRO-1828
 URL: https://issues.apache.org/jira/browse/AVRO-1828
 Project: Avro
  Issue Type: Improvement
Reporter: Niels Basjes


I was working with Apache Flink last week and they recently adopted 
http://editorconfig.org/ (see 
https://github.com/apache/flink/blob/master/.editorconfig ).

Essentially this is a very simple config file that instructs a great many 
editors to adhere to the main coding standard choices (things like character 
encoding, tabs vs. spaces, newlines, etc.) for a specific project on a per 
file type basis.

When someone opens the project in IntelliJ, it will automatically use 
these settings.

Proposal: 
# We implement this for Avro at the root level with global defaults.
# We implement a specific file per language. I think we should start with the 
top level scripting (like build.sh and pom.xml) and Java as the first language.
# We fix the violations of this standard in a single commit per language. Note 
that if we don't fix those violations then later commits will be 'harder' to 
keep clean (you will see a lot of unrelated changes) because the IDEs will 
'enforce' the standard on all touched files.

What do you guys think?





[jira] [Commented] (AVRO-1826) build.sh rat fails over extra licence files.

2016-04-10 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233994#comment-15233994
 ] 

Niels Basjes commented on AVRO-1826:


It also fails over files generated during the ./build.sh test and dist steps. 
Even after a ./build.sh clean many of these remain.

> build.sh rat fails over extra licence files.
> 
>
> Key: AVRO-1826
> URL: https://issues.apache.org/jira/browse/AVRO-1826
> Project: Avro
>  Issue Type: Bug
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>
> When running ./build.sh rat this will fail due to several license related 
> files we recently added.





[jira] [Updated] (AVRO-1826) build.sh rat fails over extra license files and many others.

2016-04-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1826:
---
Status: Patch Available  (was: Open)

> build.sh rat fails over extra license files and many others.
> 
>
> Key: AVRO-1826
> URL: https://issues.apache.org/jira/browse/AVRO-1826
> Project: Avro
>  Issue Type: Bug
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1826-20160410.patch
>
>
> When running ./build.sh rat this will fail due to several license related 
> files we recently added.





[jira] [Updated] (AVRO-1704) Standardized format for encoding messages with Avro

2016-04-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1704:
---
Attachment: AVRO-1704-20160410.patch

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Niels Basjes
> Attachments: AVRO-1704-20160410.patch
>
>
> I'm currently using the Datafile format for encoding messages that are 
> written to Kafka and Cassandra. This seems rather wasteful:
> 1. I only encode a single record at a time, so there's no need for sync 
> markers and other metadata related to multi-record files.
> 2. The entire schema is inlined every time.
> However, the Datafile format is the only one that has been standardized, 
> meaning that I can read and write data with minimal effort across the various 
> languages in use in my organization. If there was a standardized format for 
> encoding single values that was optimized for out-of-band schema transfer, I 
> would much rather use that.
> I think the necessary pieces of the format would be:
> 1. A format version number.
> 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc.
> 3. The actual schema fingerprint (according to the type.)
> 4. Optional metadata map.
> 5. The encoded datum.
> The language libraries would implement a MessageWriter that would encode 
> datums in this format, as well as a MessageReader that, given a SchemaStore, 
> would be able to decode datums. The reader would decode the fingerprint and 
> ask its SchemaStore to return the corresponding writer's schema.
> The idea is that SchemaStore would be an abstract interface that allowed 
> library users to inject custom backends. A simple, file system based one 
> could be provided out of the box.





[jira] [Updated] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'

2016-04-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1814:
---
Attachment: AVRO-1814-20160410.patch

It turns out that this problem is, in essence, a limitation of the way Java 
resolves names when there are clashes between packages, classes, fields, etc.

This patch at least reduces the probability of this occurring in user 
applications. 
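
The name clash can be reproduced without Avro. The sketch below is a 
hypothetical stand-in (the class ObscuredPackageSketch and its field 'java' 
are invented for illustration; this is not generated Avro code). A field named 
like the first segment of a package name obscures that package wherever Java 
has to interpret a dotted name as an expression, for example in a static 
method call. Referring to the class by an imported simple name is one possible 
way around it; whether the attached patch takes that route is not stated here.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a generated record class; not actual Avro output.
final class ObscuredPackageSketch {
    // Plays the role of a record field called 'org': it shares its name with
    // the first segment of a package name.
    String java = "field value";

    List<String> emptyViaImport() {
        // Compiles: the simple name 'Collections' is resolved through the import.
        return Collections.emptyList();
    }

    // Does NOT compile if uncommented. In 'java.util.Collections.emptyList()'
    // the leading 'java' is taken to be the String field above, so the
    // compiler reports that 'util' cannot be found on it.
    // List<String> broken() { return java.util.Collections.emptyList(); }

    public static void main(String[] args) {
        System.out.println(new ObscuredPackageSketch().emptyViaImport().isEmpty());
    }
}
```

Import declarations and plain type usages are not affected, because in those 
positions a field can never be meant.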

> 1.8 IDL generator broken when containing a field called 'org'
> -
>
> Key: AVRO-1814
> URL: https://issues.apache.org/jira/browse/AVRO-1814
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Dustin Spicuzza
>Assignee: Niels Basjes
> Attachments: AVRO-1814-20160410.patch
>
>
> The problem is in the generated 'readExternal' and 'writeExternal' functions, 
> because they do something like:
> WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out));
> When a member variable called 'org' exists, then the compile fails because 
> the compiler thinks that 'org' is a member variable and that 'apache cannot 
> be resolved or is not a field'. 





[jira] [Updated] (AVRO-1826) build.sh rat fails over extra license files and many others.

2016-04-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1826:
---
Attachment: AVRO-1826-20160410.patch

With this patch, ./build.sh rat passes even after running both ./build.sh test 
and ./build.sh dist.

> build.sh rat fails over extra license files and many others.
> 
>
> Key: AVRO-1826
> URL: https://issues.apache.org/jira/browse/AVRO-1826
> Project: Avro
>  Issue Type: Bug
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1826-20160410.patch
>
>
> When running ./build.sh rat this will fail due to several license related 
> files we recently added.





[jira] [Comment Edited] (AVRO-1704) Standardized format for encoding messages with Avro

2016-04-10 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234304#comment-15234304
 ] 

Niels Basjes edited comment on AVRO-1704 at 4/10/16 10:10 PM:
--

During the last few weeks I spent some time figuring out what I think the 
format should be. I created this patch, which includes a specification for the 
new format, code generators for Java, and unit tests that validate the format 
in light of schema evolution and corrupt data.

I documented the new format as follows:
{quote}
Schema tagged Binary Encoding specification

The wrapper format consists of a header and a body.
The header is always the 4 bytes of the UTF-8 encoding of the word "Avro", 
followed by a single byte indicating the version of the body format.

Version 0 of the body (currently the ONLY body format that has been defined) 
consists of:
# the fingerprint of the schema (a 64-bit long; see the section about Schema 
Fingerprints), written in the same byte order as a long that is written as a 
field in a record.
# the record serialized to bytes using the binary encoding.
{quote}
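
The layout above can be sketched in a few lines of plain Java. This is only an 
illustration, not the code from the attached patch: the class and method names 
are hypothetical, and it assumes that "written as a field in a record" means 
Avro's usual zig-zag variable-length encoding of a long. The record payload is 
taken as already-encoded bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the proposed wrapper; not part of Avro.
final class SchemaTaggedSketch {
    static final byte[] MAGIC = "Avro".getBytes(StandardCharsets.UTF_8);
    static final byte BODY_VERSION = 0;

    // Assumed encoding of the fingerprint: Avro binary encoding of a long,
    // i.e. zig-zag followed by 7 bits per byte, least significant group first.
    static void writeLong(long value, ByteArrayOutputStream out) {
        long n = (value << 1) ^ (value >> 63);
        while ((n & ~0x7FL) != 0) {
            out.write((int) ((n & 0x7F) | 0x80));
            n >>>= 7;
        }
        out.write((int) n);
    }

    // Header ("Avro" + version byte), then the fingerprint, then the record.
    static byte[] wrap(long schemaFingerprint, byte[] encodedRecord) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(MAGIC, 0, MAGIC.length);
        out.write(BODY_VERSION);
        writeLong(schemaFingerprint, out);
        out.write(encodedRecord, 0, encodedRecord.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] message = wrap(1L, new byte[] {42});
        System.out.println(message.length + " bytes");
    }
}
```

A reader would do the reverse: check the 4 magic bytes, check the version 
byte, decode the fingerprint, and look up the writer's schema before decoding 
the remaining bytes.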

Although I think this is already "pretty good" I really think this needs your 
comments and improvement suggestions.

Thanks.


was (Author: nielsbasjes):
During the last few weeks I spent some time figuring out what I think the 
format should be. I created this patch which includes specification for the new 
format, code generators for Java and unit tests that validate the format in 
light of schema evolution and corrupt data.

I documented the new format as follows:
{quote}
Schema tagged Binary Encoding specification

The wrapper format consists of a header and a body.
The header is always the 4 bytes representing the UTF-8 of the word "Avro" 
followed by a single byte indicating the version of the body format.

Version 0 of the body (currently the ONLY body format that has been defined) 
consists of:
#  the finger print (see the section about Schema Fingerprints of the schema (a 
64 bit long) that was written in the same byte order as a long is when written 
if it was a field in a record.
# the record serialized to byte using the binary encoding.
{quote}

Although I thing this is already "pretty good" I really think this needs your 
comments and improvement suggestions.

Thanks.

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Niels Basjes
> Attachments: AVRO-1704-20160410.patch
>
>
> I'm currently using the Datafile format for encoding messages that are 
> written to Kafka and Cassandra. This seems rather wasteful:
> 1. I only encode a single record at a time, so there's no need for sync 
> markers and other metadata related to multi-record files.
> 2. The entire schema is inlined every time.
> However, the Datafile format is the only one that has been standardized, 
> meaning that I can read and write data with minimal effort across the various 
> languages in use in my organization. If there was a standardized format for 
> encoding single values that was optimized for out-of-band schema transfer, I 
> would much rather use that.
> I think the necessary pieces of the format would be:
> 1. A format version number.
> 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc.
> 3. The actual schema fingerprint (according to the type.)
> 4. Optional metadata map.
> 5. The encoded datum.
> The language libraries would implement a MessageWriter that would encode 
> datums in this format, as well as a MessageReader that, given a SchemaStore, 
> would be able to decode datums. The reader would decode the fingerprint and 
> ask its SchemaStore to return the corresponding writer's schema.
> The idea is that SchemaStore would be an abstract interface that allowed 
> library users to inject custom backends. A simple, file system based one 
> could be provided out of the box.





[jira] [Updated] (AVRO-1826) build.sh rat fails over extra license files and many others.

2016-04-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1826:
---
Summary: build.sh rat fails over extra license files and many others.  
(was: build.sh rat fails over extra licence files.)

> build.sh rat fails over extra license files and many others.
> 
>
> Key: AVRO-1826
> URL: https://issues.apache.org/jira/browse/AVRO-1826
> Project: Avro
>  Issue Type: Bug
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>
> When running ./build.sh rat this will fail due to several license related 
> files we recently added.





[jira] [Updated] (AVRO-1704) Standardized format for encoding messages with Avro

2016-04-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1704:
---
Status: Patch Available  (was: Open)

During the last few weeks I spent some time figuring out what I think the 
format should be. I created this patch, which includes a specification for the 
new format, code generators for Java, and unit tests that validate the format 
in light of schema evolution and corrupt data.

I documented the new format as follows:
{quote}
Schema tagged Binary Encoding specification

The wrapper format consists of a header and a body.
The header is always the 4 bytes of the UTF-8 encoding of the word "Avro", 
followed by a single byte indicating the version of the body format.

Version 0 of the body (currently the ONLY body format that has been defined) 
consists of:
# the fingerprint of the schema (a 64-bit long; see the section about Schema 
Fingerprints), written in the same byte order as a long that is written as a 
field in a record.
# the record serialized to bytes using the binary encoding.
{quote}

Although I think this is already "pretty good" I really think this needs your 
comments and improvement suggestions.

Thanks.

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Niels Basjes
> Attachments: AVRO-1704-20160410.patch
>
>
> I'm currently using the Datafile format for encoding messages that are 
> written to Kafka and Cassandra. This seems rather wasteful:
> 1. I only encode a single record at a time, so there's no need for sync 
> markers and other metadata related to multi-record files.
> 2. The entire schema is inlined every time.
> However, the Datafile format is the only one that has been standardized, 
> meaning that I can read and write data with minimal effort across the various 
> languages in use in my organization. If there was a standardized format for 
> encoding single values that was optimized for out-of-band schema transfer, I 
> would much rather use that.
> I think the necessary pieces of the format would be:
> 1. A format version number.
> 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc.
> 3. The actual schema fingerprint (according to the type.)
> 4. Optional metadata map.
> 5. The encoded datum.
> The language libraries would implement a MessageWriter that would encode 
> datums in this format, as well as a MessageReader that, given a SchemaStore, 
> would be able to decode datums. The reader would decode the fingerprint and 
> ask its SchemaStore to return the corresponding writer's schema.
> The idea is that SchemaStore would be an abstract interface that allowed 
> library users to inject custom backends. A simple, file system based one 
> could be provided out of the box.





[jira] [Updated] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'

2016-04-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1814:
---
Status: Patch Available  (was: Open)

> 1.8 IDL generator broken when containing a field called 'org'
> -
>
> Key: AVRO-1814
> URL: https://issues.apache.org/jira/browse/AVRO-1814
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Dustin Spicuzza
>Assignee: Niels Basjes
> Attachments: AVRO-1814-20160410.patch
>
>
> The problem is in the generated 'readExternal' and 'writeExternal' functions, 
> because they do something like:
> WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out));
> When a member variable called 'org' exists, then the compile fails because 
> the compiler thinks that 'org' is a member variable and that 'apache cannot 
> be resolved or is not a field'. 





[jira] [Created] (AVRO-1826) build.sh rat fails over extra licence files.

2016-04-09 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1826:
--

 Summary: build.sh rat fails over extra licence files.
 Key: AVRO-1826
 URL: https://issues.apache.org/jira/browse/AVRO-1826
 Project: Avro
  Issue Type: Bug
Reporter: Niels Basjes
Assignee: Niels Basjes








[jira] [Updated] (AVRO-1826) build.sh rat fails over extra licence files.

2016-04-09 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1826:
---
Description: When running ./build.sh rat this will fail due to several 
license related files we recently added.

> build.sh rat fails over extra licence files.
> 
>
> Key: AVRO-1826
> URL: https://issues.apache.org/jira/browse/AVRO-1826
> Project: Avro
>  Issue Type: Bug
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>
> When running ./build.sh rat this will fail due to several license related 
> files we recently added.





[jira] [Commented] (AVRO-1825) Allow running build.sh dist under git

2016-04-11 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236098#comment-15236098
 ] 

Niels Basjes commented on AVRO-1825:


I'll commit this when I get back from the Hadoop Summit (Ireland).
I just need to read up on the exact procedure.

> Allow running build.sh dist under git
> -
>
> Key: AVRO-1825
> URL: https://issues.apache.org/jira/browse/AVRO-1825
> Project: Avro
>  Issue Type: Improvement
>  Components: build
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1825-20160409.patch
>
>
> When working from a git clone instead of an svn checkout, build.sh dist 
> cannot run because it explicitly requires the working directory to be an 
> svn checkout.
> This should be more flexible.





[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro

2016-03-24 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210535#comment-15210535
 ] 

Niels Basjes commented on AVRO-1704:


I did some experimenting over the last week and I posted my changed version of 
Avro here: https://github.com/nielsbasjes/avro/tree/AVRO-1704

What I did so far:
# Added to Schema the getFingerPrint() method that uses the CRC-64-AVRO to 
calculate the schema finger print.
# Added a few SchemaStorage related classes that allow storing schemas in 
memory.
# Added to the generated classes the toBytes() method and the fromBytes static 
method. Both effectively call the 'real' implementations which are in the 
SpecificRecordBase class.

All of this passes all of the Java unit tests.
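
For readers unfamiliar with CRC-64-AVRO: it is the 64-bit Rabin fingerprint 
described in the Schema Fingerprints section of the Avro specification. The 
sketch below follows that published algorithm over an arbitrary byte array. 
The class name is hypothetical and this is not the getFingerPrint() code from 
the branch; a real implementation would feed it the schema's Parsing Canonical 
Form.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone version of the CRC-64-AVRO fingerprint.
final class Crc64AvroSketch {
    // Fingerprint of the empty input; also the constant the table is built from.
    static final long EMPTY = 0xc15d213aa4d7a795L;
    private static final long[] TABLE = new long[256];

    static {
        // Precompute the effect of each possible input byte.
        for (int i = 0; i < 256; i++) {
            long fp = i;
            for (int j = 0; j < 8; j++) {
                fp = (fp >>> 1) ^ (EMPTY & -(fp & 1L));
            }
            TABLE[i] = fp;
        }
    }

    // Folds the input into the running fingerprint one byte at a time.
    static long fingerprint64(byte[] data) {
        long fp = EMPTY;
        for (byte b : data) {
            fp = (fp >>> 8) ^ TABLE[(int) (fp ^ b) & 0xff];
        }
        return fp;
    }

    public static void main(String[] args) {
        byte[] schema = "\"string\"".getBytes(StandardCharsets.UTF_8);
        System.out.printf("%016x%n", fingerprint64(schema));
    }
}
```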

At the application end my test code (using 3 slightly different variations of 
the same schema) looks like this. 
This works exactly as I expect it to.
{code:java}
// Register the three schema versions.
SchemaFactory.put(com.bol.measure.v1.Measurement.getClassSchema());
SchemaFactory.put(com.bol.measure.v2.Measurement.getClassSchema());
SchemaFactory.put(com.bol.measure.v3.Measurement.getClassSchema());

// Serialize a v1 record.
com.bol.measure.v1.Measurement measurement =
    DummyMeasurementFactory.createTestMeasurement(timestamp);
byte[] bytesV1 = measurement.toBytes();

// Read the v1 bytes back as v2 and v3 records.
com.bol.measure.v2.Measurement newBornV2 =
    com.bol.measure.v2.Measurement.fromBytes(bytesV1);
com.bol.measure.v3.Measurement newBornV3 =
    com.bol.measure.v3.Measurement.fromBytes(bytesV1);
{code}

Things currently missing: Documentation, extra tests, etc.

I could really use some feedback on the structure of my change and advice on 
how to approach the need to call a 'close()' method on the schema storage part.

Thanks.

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>
> I'm currently using the Datafile format for encoding messages that are 
> written to Kafka and Cassandra. This seems rather wasteful:
> 1. I only encode a single record at a time, so there's no need for sync 
> markers and other metadata related to multi-record files.
> 2. The entire schema is inlined every time.
> However, the Datafile format is the only one that has been standardized, 
> meaning that I can read and write data with minimal effort across the various 
> languages in use in my organization. If there was a standardized format for 
> encoding single values that was optimized for out-of-band schema transfer, I 
> would much rather use that.
> I think the necessary pieces of the format would be:
> 1. A format version number.
> 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc.
> 3. The actual schema fingerprint (according to the type.)
> 4. Optional metadata map.
> 5. The encoded datum.
> The language libraries would implement a MessageWriter that would encode 
> datums in this format, as well as a MessageReader that, given a SchemaStore, 
> would be able to decode datums. The reader would decode the fingerprint and 
> ask its SchemaStore to return the corresponding writer's schema.
> The idea is that SchemaStore would be an abstract interface that allowed 
> library users to inject custom backends. A simple, file system based one 
> could be provided out of the box.





[jira] [Commented] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'

2016-03-25 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211839#comment-15211839
 ] 

Niels Basjes commented on AVRO-1814:


I did a quick test and this is bigger than just the 'org' case. 
It also applies to the TLD of the namespace of the company actually writing 
the software.
{code}
@namespace("nl.basjes.test")
protocol Hacking {
  record Hack {
    string nl;
    string org;
    string com;
  }
}
{code}
This gives similar errors for generated code like this:
{code}super(nl.basjes.test.Hack.SCHEMA$);{code}


> 1.8 IDL generator broken when containing a field called 'org'
> -
>
> Key: AVRO-1814
> URL: https://issues.apache.org/jira/browse/AVRO-1814
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Dustin Spicuzza
>
> The problem is in the generated 'readExternal' and 'writeExternal' functions, 
> because they do something like:
> WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out));
> When a member variable called 'org' exists, then the compile fails because 
> the compiler thinks that 'org' is a member variable and that 'apache cannot 
> be resolved or is not a field'. 





[jira] [Assigned] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'

2016-03-25 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes reassigned AVRO-1814:
--

Assignee: Niels Basjes

> 1.8 IDL generator broken when containing a field called 'org'
> -
>
> Key: AVRO-1814
> URL: https://issues.apache.org/jira/browse/AVRO-1814
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Dustin Spicuzza
>Assignee: Niels Basjes
>
> The problem is in the generated 'readExternal' and 'writeExternal' functions, 
> because they do something like:
> WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out));
> When a member variable called 'org' exists, then the compile fails because 
> the compiler thinks that 'org' is a member variable and that 'apache cannot 
> be resolved or is not a field'. 





[jira] [Commented] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'

2016-04-25 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257020#comment-15257020
 ] 

Niels Basjes commented on AVRO-1814:


I did that because I wanted an explicit test case to verify the specifics of 
the TLD of the namespace in addition to the main TLD 'org'.
I guess it is fine to simply leave that part as-is (i.e. keep it org.apache. ).

> 1.8 IDL generator broken when containing a field called 'org'
> -
>
> Key: AVRO-1814
> URL: https://issues.apache.org/jira/browse/AVRO-1814
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Dustin Spicuzza
>Assignee: Niels Basjes
> Attachments: AVRO-1814-20160410.patch
>
>
> The problem is in the generated 'readExternal' and 'writeExternal' functions, 
> because they do something like:
> WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out));
> When a member variable called 'org' exists, then the compile fails because 
> the compiler thinks that 'org' is a member variable and that 'apache cannot 
> be resolved or is not a field'. 





[jira] [Updated] (AVRO-1834) Lower the Javadoc warnings on the generated code.

2016-04-25 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1834:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed

> Lower the Javadoc warnings on the generated code.
> -
>
> Key: AVRO-1834
> URL: https://issues.apache.org/jira/browse/AVRO-1834
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1834-2016-04-25.patch
>
>
> I see a LOT of JavaDoc related warnings on the generated code in Java.
> They are all about things like {{warning: no @param for}} and {{missing: 
> @return}}.
> In my work project this results in hundreds of warnings so they obfuscate the 
> things that do need attention.
> As these are generated I expect the required changes to be minimal.





[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro

2016-04-22 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254092#comment-15254092
 ] 

Niels Basjes commented on AVRO-1704:


Question: What would be the preferred way of handling error situations like 
* Unknown schema fingerprint
* Bad set of bytes (in various forms)

I see at least two general directions:
# Return null
# Throw an error

What is preferred in this case?
Which is 'better' for the application developers?
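
To make the trade-off easier to discuss, here is a small sketch of the two 
styles side by side. Everything in it is hypothetical (the class, the 
UnknownSchemaException, and the in-memory map standing in for a schema store); 
it only illustrates how each choice looks to calling code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these types exist in Avro.
final class ErrorHandlingSketch {
    static final class UnknownSchemaException extends RuntimeException {
        UnknownSchemaException(long fingerprint) {
            super("No schema registered for fingerprint " + fingerprint);
        }
    }

    // Stand-in for a schema store: fingerprint -> schema JSON.
    private final Map<Long, String> schemas = new HashMap<>();

    void register(long fingerprint, String schemaJson) {
        schemas.put(fingerprint, schemaJson);
    }

    // Direction 1: return null; the caller has to check the result.
    String findOrNull(long fingerprint) {
        return schemas.get(fingerprint);
    }

    // Direction 2: throw; the caller has to catch or propagate the exception.
    String findOrThrow(long fingerprint) {
        String schema = schemas.get(fingerprint);
        if (schema == null) {
            throw new UnknownSchemaException(fingerprint);
        }
        return schema;
    }

    public static void main(String[] args) {
        ErrorHandlingSketch store = new ErrorHandlingSketch();
        store.register(1L, "\"string\"");
        System.out.println(store.findOrNull(2L));
        System.out.println(store.findOrThrow(1L));
    }
}
```

With the null variant, an unknown fingerprint and a corrupt set of bytes look 
the same to the caller unless extra information is returned; with the 
exception variant, each situation can get its own exception type.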

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Niels Basjes
> Attachments: AVRO-1704-20160410.patch
>
>
> I'm currently using the Datafile format for encoding messages that are 
> written to Kafka and Cassandra. This seems rather wasteful:
> 1. I only encode a single record at a time, so there's no need for sync 
> markers and other metadata related to multi-record files.
> 2. The entire schema is inlined every time.
> However, the Datafile format is the only one that has been standardized, 
> meaning that I can read and write data with minimal effort across the various 
> languages in use in my organization. If there was a standardized format for 
> encoding single values that was optimized for out-of-band schema transfer, I 
> would much rather use that.
> I think the necessary pieces of the format would be:
> 1. A format version number.
> 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc.
> 3. The actual schema fingerprint (according to the type.)
> 4. Optional metadata map.
> 5. The encoded datum.
> The language libraries would implement a MessageWriter that would encode 
> datums in this format, as well as a MessageReader that, given a SchemaStore, 
> would be able to decode datums. The reader would decode the fingerprint and 
> ask its SchemaStore to return the corresponding writer's schema.
> The idea is that SchemaStore would be an abstract interface that allowed 
> library users to inject custom backends. A simple, file system based one 
> could be provided out of the box.





[jira] [Updated] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1835:
---
Attachment: AVRO-1835-2016-04-27.patch

[~rdblue] Good idea to use a profile. 
The attached patch works for me on my normal system (uses 1.8) and within our 
docker image (uses 1.7).
I verified by putting in some wrong values (causing lots of errors) that indeed 
in both cases the correct argLine value is selected when running tests.

Please verify.
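
For context, the profile-based approach could look roughly like the fragment 
below. It is a sketch rather than the contents of the attached patch: the 
profile id and the exact JVM flag are assumptions. It relies on Maven's 
JDK-version profile activation and on Surefire picking up the argLine 
property.

```xml
<profiles>
  <!-- JDK 7 and older: PermGen still exists, so keep the setting. -->
  <profile>
    <id>jdk-pre-8</id>
    <activation>
      <jdk>[1.5,1.8)</jdk>
    </activation>
    <properties>
      <argLine>-XX:MaxPermSize=200m</argLine>
    </properties>
  </profile>
</profiles>
```

On JDK 1.8 and newer the profile is not activated, so the option is never 
passed and the warning disappears.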

> Running tests using JDK 1.8 complains about MaxPermSize
> ---
>
> Key: AVRO-1835
> URL: https://issues.apache.org/jira/browse/AVRO-1835
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1835-2016-04-25.patch, AVRO-1835-2016-04-27.patch
>
>
> When building AVRO under JDK 1.8 (as I assume most of us do) the output 
> contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option 
> MaxPermSize=200m; support was removed in 8.0{code} for every test class that 
> is run.
> The output becomes cluttered like this:
> {code}
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestEncoders
> Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - 
> in org.apache.avro.io.TestEncoders
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestBlockingIO2
> Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - 
> in org.apache.avro.io.TestBlockingIO2
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestBlockingIO
> Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - 
> in org.apache.avro.io.TestBlockingIO
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator
> Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - 
> in org.apache.avro.io.parsing.TestResolvingGrammarGenerator
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - 
> in org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestResolvingIOResolving
> Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - 
> in org.apache.avro.io.TestResolvingIOResolving
> {code}





[jira] [Updated] (AVRO-1834) Lower the Javadoc warnings on the generated code.

2016-04-25 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1834:
---
Description: 
I see a LOT of JavaDoc related warnings on the generated code in Java.
They are all about things like {{warning: no @param for}} and {{missing: 
@return}}.
In my work project this results in hundreds of warnings so they obfuscate the 
things that do need attention.

As these are generated I expect the required changes to be minimal.

  was:I see a LOT of JavaDoc related warnings on the generated code in Java


> Lower the Javadoc warnings on the generated code.
> -
>
> Key: AVRO-1834
> URL: https://issues.apache.org/jira/browse/AVRO-1834
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
>
> I see a LOT of JavaDoc related warnings on the generated code in Java.
> They are all about things like {{warning: no @param for}} and {{missing: 
> @return}}.
> In my work project this results in hundreds of warnings so they obfuscate the 
> things that do need attention.
> As these are generated I expect the required changes to be minimal.





[jira] [Created] (AVRO-1834) Lower the Javadoc warnings on the generated code.

2016-04-25 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1834:
--

 Summary: Lower the Javadoc warnings on the generated code.
 Key: AVRO-1834
 URL: https://issues.apache.org/jira/browse/AVRO-1834
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.1


I see a LOT of JavaDoc related warnings on the generated code in Java





[jira] [Updated] (AVRO-1834) Lower the Javadoc warnings on the generated code.

2016-04-25 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1834:
---
Attachment: AVRO-1834-2016-04-25.patch

This patch adds only 3 extra lines in the record.vm (and as a consequence 
changes the generated Player.java files).

This change drops the number of Javadoc warnings in my own project from >100 to 
0.
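
To illustrate the kind of warning involved, here is a minimal sketch of an accessor pair carrying the {{@return}} and {{@param}} tags that the Javadoc tool asks for. The class name PlayerSketch and the field 'number' are made up for this illustration; this is not the actual generated Player.java, and the three lines added to record.vm are not reproduced here.

```java
/** Hypothetical stand-in for a generated record class; not actual Avro output. */
final class PlayerSketch {
    private int number;

    /**
     * Gets the value of the 'number' field.
     * @return The value of the 'number' field.
     */
    public int getNumber() {
        return number;
    }

    /**
     * Sets the value of the 'number' field.
     * @param value The new value of the 'number' field.
     */
    public void setNumber(int value) {
        this.number = value;
    }

    public static void main(String[] args) {
        PlayerSketch player = new PlayerSketch();
        player.setNumber(7);
        System.out.println(player.getNumber());
    }
}
```

Without the {{@return}} and {{@param}} lines, a Javadoc run with missing-tag checks enabled would typically report one warning per method, which is how a large generated code base can reach hundreds of them.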

> Lower the Javadoc warnings on the generated code.
> -
>
> Key: AVRO-1834
> URL: https://issues.apache.org/jira/browse/AVRO-1834
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1834-2016-04-25.patch
>
>
> I see a LOT of JavaDoc related warnings on the generated code in Java.
> They are all about things like {{warning: no @param for}} and {{missing: 
> @return}}.
> In my work project this results in hundreds of warnings so they obfuscate the 
> things that do need attention.
> As these are generated I expect the required changes to be minimal.





[jira] [Commented] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'

2016-04-25 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256213#comment-15256213
 ] 

Niels Basjes commented on AVRO-1814:


[~rdblue] can you please have a quick look at this one?
Do we keep the old behavior (i.e. simply tell people "don't do this" and put 
this as a "Won't fix") or do we reduce the impact of this by means of the 
change I put in?


> 1.8 IDL generator broken when containing a field called 'org'
> -
>
> Key: AVRO-1814
> URL: https://issues.apache.org/jira/browse/AVRO-1814
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Dustin Spicuzza
>Assignee: Niels Basjes
> Attachments: AVRO-1814-20160410.patch
>
>
> The problem is in the generated 'readExternal' and 'writeExternal' functions, 
> because they do something like:
> WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out));
> When a member variable called 'org' exists, then the compile fails because 
> the compiler thinks that 'org' is a member variable and that 'apache cannot 
> be resolved or is not a field'. 
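
The compile failure described above is ordinary Java name obscuring: when a simple name could be a variable or a package, the variable wins. The sketch below reproduces the situation without Avro, using a field called 'java' as a stand-in for 'org'; the class name is hypothetical. It also shows one possible way around it (referring to the type through an import), which may or may not be what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrates name obscuring with a field called 'java' (standing in for 'org'). */
final class ObscuredNameSketch {
    // A field whose name equals the first segment of a package name.
    private final String java = "just a field";

    List<String> names() {
        // Writing `new java.util.ArrayList<String>()` here would not compile:
        // inside this class the simple name `java` resolves to the field above,
        // so the compiler looks for a member `util` on a String.
        // Referring to the type through an import sidesteps the clash.
        List<String> result = new ArrayList<>();
        result.add(java);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(new ObscuredNameSketch().names());
    }
}
```

The same reasoning applies to a generated record with a field called 'org': any fully qualified {{org.apache.avro...}} reference inside that class is read as a field access first.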





[jira] [Updated] (AVRO-1834) Lower the Javadoc warnings on the generated code.

2016-04-25 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1834:
---
Status: Patch Available  (was: Open)

> Lower the Javadoc warnings on the generated code.
> -
>
> Key: AVRO-1834
> URL: https://issues.apache.org/jira/browse/AVRO-1834
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1834-2016-04-25.patch
>
>
> I see a LOT of JavaDoc related warnings on the generated code in Java.
> They are all about things like {{warning: no @param for}} and {{missing: 
> @return}}.
> In my work project this results in hundreds of warnings so they obfuscate the 
> things that do need attention.
> As these are generated I expect the required changes to be minimal.





[jira] [Updated] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize

2016-04-25 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1835:
---
Status: Patch Available  (was: Open)

> Running tests using JDK 1.8 complains about MaxPermSize
> ---
>
> Key: AVRO-1835
> URL: https://issues.apache.org/jira/browse/AVRO-1835
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1835-2016-04-25.patch
>
>
> When building AVRO under JDK 1.8 (as I assume most of us do) the output 
> contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option 
> MaxPermSize=200m; support was removed in 8.0{code} for every test class that 
> is run.
> Then the output becomes cluttered like this:
> {code}
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestEncoders
> Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - 
> in org.apache.avro.io.TestEncoders
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestBlockingIO2
> Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - 
> in org.apache.avro.io.TestBlockingIO2
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestBlockingIO
> Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - 
> in org.apache.avro.io.TestBlockingIO
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator
> Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - 
> in org.apache.avro.io.parsing.TestResolvingGrammarGenerator
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - 
> in org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestResolvingIOResolving
> Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - 
> in org.apache.avro.io.TestResolvingIOResolving
> {code}





[jira] [Updated] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1835:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed

> Running tests using JDK 1.8 complains about MaxPermSize
> ---
>
> Key: AVRO-1835
> URL: https://issues.apache.org/jira/browse/AVRO-1835
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1835-2016-04-25.patch, AVRO-1835-2016-04-27.patch
>
>
> When building AVRO under JDK 1.8 (as I assume most of us do) the output 
> contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option 
> MaxPermSize=200m; support was removed in 8.0{code} for every test class that 
> is run.
> Then the output becomes cluttered like this:
> {code}
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestEncoders
> Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - 
> in org.apache.avro.io.TestEncoders
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestBlockingIO2
> Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - 
> in org.apache.avro.io.TestBlockingIO2
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestBlockingIO
> Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - 
> in org.apache.avro.io.TestBlockingIO
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator
> Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - 
> in org.apache.avro.io.parsing.TestResolvingGrammarGenerator
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - 
> in org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestResolvingIOResolving
> Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - 
> in org.apache.avro.io.TestResolvingIOResolving
> {code}





[jira] [Assigned] (AVRO-1828) Add EditorConfig file

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes reassigned AVRO-1828:
--

Assignee: Niels Basjes

> Add EditorConfig file
> -
>
> Key: AVRO-1828
> URL: https://issues.apache.org/jira/browse/AVRO-1828
> Project: Avro
>  Issue Type: Improvement
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>
> I was working with Apache Flink last week and they recently implemented 
> http://editorconfig.org/ ( see here 
> https://github.com/apache/flink/blob/master/.editorconfig )
> Essentially this is a very simple config file that instructs a great many 
> editors to adhere to the main coding standard choices (things like character 
> encoding, tabs vs. spaces, newlines, etc.) for a specific project on a per 
> file type basis.
> When someone opens the project in IntelliJ, it will automatically use 
> these settings.
> Proposal: 
> # We implement this for Avro at the root level with global defaults.
> # We implement a specific file per language. I think we should start with the 
> top level scripting (like build.sh and pom.xml) and Java as the first 
> language.
> # We fix the violations of this standard in a single commit per language. 
> Note that if we don't fix those violations then later commits will be 
> 'harder' to keep clean (you will see a lot of unrelated changes) because the 
> IDEs will 'enforce' the standard on all touched files.
> What do you guys think?





[jira] [Updated] (AVRO-1814) Generated java code fails on variables with a TLD name like 'org'

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1814:
---
  Resolution: Fixed
Release Note: Using a variable name that also happens to be a top-level 
domain name (like 'org') no longer causes errors. 
  Status: Resolved  (was: Patch Available)

Committed

> Generated java code fails on variables with a TLD name like 'org'
> -
>
> Key: AVRO-1814
> URL: https://issues.apache.org/jira/browse/AVRO-1814
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Dustin Spicuzza
>Assignee: Niels Basjes
> Attachments: AVRO-1814-20160410.patch, AVRO-1814-20160428.patch
>
>
> The problem is in the generated 'readExternal' and 'writeExternal' functions, 
> because they do something like:
> WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out));
> When a member variable called 'org' exists, then the compile fails because 
> the compiler thinks that 'org' is a member variable and that 'apache cannot 
> be resolved or is not a field'. 





[jira] [Updated] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1814:
---
Attachment: AVRO-1814-20160428.patch

I put the original namespace back. 
Changing it wasn't really needed to verify the problem.

> 1.8 IDL generator broken when containing a field called 'org'
> -
>
> Key: AVRO-1814
> URL: https://issues.apache.org/jira/browse/AVRO-1814
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Dustin Spicuzza
>Assignee: Niels Basjes
> Attachments: AVRO-1814-20160410.patch, AVRO-1814-20160428.patch
>
>
> The problem is in the generated 'readExternal' and 'writeExternal' functions, 
> because they do something like:
> WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out));
> When a member variable called 'org' exists, then the compile fails because 
> the compiler thinks that 'org' is a member variable and that 'apache cannot 
> be resolved or is not a field'. 





[jira] [Updated] (AVRO-1828) Add EditorConfig file

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1828:
---
Attachment: AVRO-1828-2016-04-28.patch

This is the .editorconfig file for .sh, .xml and .java.

This patch also includes all these changes for the affected files:
- Remove trailing spaces and tabs
- Remove (leading) tabs

For the files where there were leading tabs I fixed the indentation (like in 
the toplevel build.sh and the build.sh scripts for several of the languages)

I chose not to touch the leading tabs in the documentation files at this moment.

This is mostly about spaces and tabs; so after applying this patch a command 
like {{git diff -w}} will yield almost no changes.

I still need to run the full test set (all languages) on this one.
So far I ran the Java tests and they passed. 

> Add EditorConfig file
> -
>
> Key: AVRO-1828
> URL: https://issues.apache.org/jira/browse/AVRO-1828
> Project: Avro
>  Issue Type: Improvement
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1828-2016-04-28.patch
>
>
> I was working with Apache Flink last week and they recently implemented 
> http://editorconfig.org/ ( see here 
> https://github.com/apache/flink/blob/master/.editorconfig )
> Essentially this is a very simple config file that instructs a great many 
> editors to adhere to the main coding standard choices (things like character 
> encoding, tabs vs. spaces, newlines, etc.) for a specific project on a per 
> file type basis.
> When someone opens the project in IntelliJ, it will automatically use 
> these settings.
> Proposal: 
> # We implement this for Avro at the root level with global defaults.
> # We implement a specific file per language. I think we should start with the 
> top level scripting (like build.sh and pom.xml) and Java as the first 
> language.
> # We fix the violations of this standard in a single commit per language. 
> Note that if we don't fix those violations then later commits will be 
> 'harder' to keep clean (you will see a lot of unrelated changes) because the 
> IDEs will 'enforce' the standard on all touched files.
> What do you guys think?





[jira] [Updated] (AVRO-1828) Add EditorConfig file

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1828:
---
Attachment: AVRO-1828-2016-04-28-ratfix.patch

The new file failed {{./build.sh rat}}.

> Add EditorConfig file
> -
>
> Key: AVRO-1828
> URL: https://issues.apache.org/jira/browse/AVRO-1828
> Project: Avro
>  Issue Type: Improvement
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1828-2016-04-28-ratfix.patch, 
> AVRO-1828-2016-04-28.patch
>
>
> I was working with Apache Flink last week and they recently implemented 
> http://editorconfig.org/ ( see here 
> https://github.com/apache/flink/blob/master/.editorconfig )
> Essentially this is a very simple config file that instructs a great many 
> editors to adhere to the main coding standard choices (things like character 
> encoding, tabs vs. spaces, newlines, etc.) for a specific project on a per 
> file type basis.
> When someone opens the project in IntelliJ, it will automatically use 
> these settings.
> Proposal: 
> # We implement this for Avro at the root level with global defaults.
> # We implement a specific file per language. I think we should start with the 
> top level scripting (like build.sh and pom.xml) and Java as the first 
> language.
> # We fix the violations of this standard in a single commit per language. 
> Note that if we don't fix those violations then later commits will be 
> 'harder' to keep clean (you will see a lot of unrelated changes) because the 
> IDEs will 'enforce' the standard on all touched files.
> What do you guys think?





[jira] [Updated] (AVRO-1828) Add EditorConfig file

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1828:
---
Affects Version/s: 1.8.0
   Status: Patch Available  (was: Open)

Please review / comment

> Add EditorConfig file
> -
>
> Key: AVRO-1828
> URL: https://issues.apache.org/jira/browse/AVRO-1828
> Project: Avro
>  Issue Type: Improvement
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: AVRO-1828-2016-04-28-ratfix.patch, 
> AVRO-1828-2016-04-28.patch
>
>
> I was working with Apache Flink last week and they recently implemented 
> http://editorconfig.org/ ( see here 
> https://github.com/apache/flink/blob/master/.editorconfig )
> Essentially this is a very simple config file that instructs a great many 
> editors to adhere to the main coding standard choices (things like character 
> encoding, tabs vs. spaces, newlines, etc.) for a specific project on a per 
> file type basis.
> When someone opens the project in IntelliJ, it will automatically use 
> these settings.
> Proposal: 
> # We implement this for Avro at the root level with global defaults.
> # We implement a specific file per language. I think we should start with the 
> top level scripting (like build.sh and pom.xml) and Java as the first 
> language.
> # We fix the violations of this standard in a single commit per language. 
> Note that if we don't fix those violations then later commits will be 
> 'harder' to keep clean (you will see a lot of unrelated changes) because the 
> IDEs will 'enforce' the standard on all touched files.
> What do you guys think?





[jira] [Updated] (AVRO-1704) Standardized format for encoding messages with Avro

2016-04-28 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1704:
---
Status: Open  (was: Patch Available)

> Standardized format for encoding messages with Avro
> ---
>
> Key: AVRO-1704
> URL: https://issues.apache.org/jira/browse/AVRO-1704
> Project: Avro
>  Issue Type: Improvement
>Reporter: Daniel Schierbeck
>Assignee: Niels Basjes
> Attachments: AVRO-1704-20160410.patch
>
>
> I'm currently using the Datafile format for encoding messages that are 
> written to Kafka and Cassandra. This seems rather wasteful:
> 1. I only encode a single record at a time, so there's no need for sync 
> markers and other metadata related to multi-record files.
> 2. The entire schema is inlined every time.
> However, the Datafile format is the only one that has been standardized, 
> meaning that I can read and write data with minimal effort across the various 
> languages in use in my organization. If there was a standardized format for 
> encoding single values that was optimized for out-of-band schema transfer, I 
> would much rather use that.
> I think the necessary pieces of the format would be:
> 1. A format version number.
> 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc.
> 3. The actual schema fingerprint (according to the type.)
> 4. Optional metadata map.
> 5. The encoded datum.
> The language libraries would implement a MessageWriter that would encode 
> datums in this format, as well as a MessageReader that, given a SchemaStore, 
> would be able to decode datums. The reader would decode the fingerprint and 
> ask its SchemaStore to return the corresponding writer's schema.
> The idea is that SchemaStore would be an abstract interface that allowed 
> library users to inject custom backends. A simple, file system based one 
> could be provided out of the box.
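
The layout proposed above can be sketched in plain Java as follows. Everything here is an assumption made for illustration: the class name, the version and fingerprint-type byte values, the SchemaStore method name, and the choice to hash the raw schema text with SHA-256. The optional metadata map is left out, and the datum is treated as opaque, already-encoded bytes. It is not an existing Avro API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-message framing: version, fingerprint type, fingerprint, datum. */
final class SingleMessageSketch {
    static final byte FORMAT_VERSION = 1;      // assumed value
    static final byte FINGERPRINT_SHA256 = 2;  // assumed type identifier
    static final int SHA256_LENGTH = 32;

    /** Hypothetical pluggable lookup of a writer's schema by its fingerprint. */
    interface SchemaStore {
        String findByFingerprint(byte[] fingerprint);
    }

    /** Holds what a reader gets back: the writer's schema and the still-encoded datum. */
    static final class Decoded {
        final String writerSchema;
        final byte[] datum;

        Decoded(String writerSchema, byte[] datum) {
            this.writerSchema = writerSchema;
            this.datum = datum;
        }
    }

    /** Simplification: hashes the schema text as-is instead of a canonical form. */
    static byte[] sha256(String schemaJson) {
        try {
            return MessageDigest.getInstance("SHA-256")
                .digest(schemaJson.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] encode(String schemaJson, byte[] datum) {
        byte[] fingerprint = sha256(schemaJson);
        ByteBuffer out = ByteBuffer.allocate(2 + fingerprint.length + datum.length);
        out.put(FORMAT_VERSION).put(FINGERPRINT_SHA256).put(fingerprint).put(datum);
        return out.array();
    }

    static Decoded decode(byte[] message, SchemaStore store) {
        ByteBuffer in = ByteBuffer.wrap(message);
        if (in.get() != FORMAT_VERSION) {
            throw new IllegalArgumentException("unknown format version");
        }
        if (in.get() != FINGERPRINT_SHA256) {
            throw new IllegalArgumentException("unknown fingerprint type");
        }
        byte[] fingerprint = new byte[SHA256_LENGTH];
        in.get(fingerprint);
        String schema = store.findByFingerprint(fingerprint);
        if (schema == null) {
            throw new IllegalStateException("schema not found for fingerprint");
        }
        byte[] datum = new byte[in.remaining()];
        in.get(datum);
        return new Decoded(schema, datum);
    }

    public static void main(String[] args) {
        String schema = "{\"type\":\"string\"}";
        // In-memory store; ByteBuffer keys compare by content.
        Map<ByteBuffer, String> known = new HashMap<>();
        known.put(ByteBuffer.wrap(sha256(schema)), schema);
        SchemaStore store = fp -> known.get(ByteBuffer.wrap(fp));

        // Avro binary encoding of the string "foo": length 3 as zigzag varint, then the bytes.
        byte[] message = encode(schema, new byte[] {6, 'f', 'o', 'o'});
        Decoded decoded = decode(message, store);
        System.out.println(decoded.writerSchema + " / " + decoded.datum.length + " bytes");
    }
}
```

With a 32-byte fingerprint the per-message overhead in this sketch is 34 bytes, regardless of how large the schema is, which is the saving the proposal is after compared with inlining the schema every time.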





[jira] [Created] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize

2016-04-25 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1835:
--

 Summary: Running tests using JDK 1.8 complains about MaxPermSize
 Key: AVRO-1835
 URL: https://issues.apache.org/jira/browse/AVRO-1835
 Project: Avro
  Issue Type: Improvement
  Components: java
Affects Versions: 1.8.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 1.8.1


When building AVRO under JDK 1.8 (as I assume most of us do) the output 
contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option 
MaxPermSize=200m; support was removed in 8.0{code} for every test class that is 
run.
Then the output becomes cluttered like this:
{code}
---
 T E S T S
---
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was 
removed in 8.0
Running org.apache.avro.io.TestEncoders
Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - in 
org.apache.avro.io.TestEncoders
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was 
removed in 8.0
Running org.apache.avro.io.TestBlockingIO2
Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - in 
org.apache.avro.io.TestBlockingIO2
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was 
removed in 8.0
Running org.apache.avro.io.TestBlockingIO
Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - 
in org.apache.avro.io.TestBlockingIO
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was 
removed in 8.0
Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator
Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - in 
org.apache.avro.io.parsing.TestResolvingGrammarGenerator
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was 
removed in 8.0
Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - in 
org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was 
removed in 8.0
Running org.apache.avro.io.TestResolvingIOResolving
Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - 
in org.apache.avro.io.TestResolvingIOResolving
{code}





[jira] [Updated] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize

2016-04-25 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1835:
---
Attachment: AVRO-1835-2016-04-25.patch

As this ONLY affects running tests I suspect it is safe to assume we have Java 
1.8 available.
I'm not sure about the current status of systems like Jenkins.
[~rdblue] perhaps we should wait with this one until AVRO-1705 is completed?

> Running tests using JDK 1.8 complains about MaxPermSize
> ---
>
> Key: AVRO-1835
> URL: https://issues.apache.org/jira/browse/AVRO-1835
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1835-2016-04-25.patch
>
>
> When building AVRO under JDK 1.8 (as I assume most of us do) the output 
> contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option 
> MaxPermSize=200m; support was removed in 8.0{code} for every test class that 
> is run.
> Then the output becomes cluttered like this:
> {code}
> ---
>  T E S T S
> ---
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestEncoders
> Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - 
> in org.apache.avro.io.TestEncoders
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestBlockingIO2
> Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - 
> in org.apache.avro.io.TestBlockingIO2
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestBlockingIO
> Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - 
> in org.apache.avro.io.TestBlockingIO
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator
> Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - 
> in org.apache.avro.io.parsing.TestResolvingGrammarGenerator
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - 
> in org.apache.avro.io.parsing.TestResolvingGrammarGenerator2
> OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support 
> was removed in 8.0
> Running org.apache.avro.io.TestResolvingIOResolving
> Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - 
> in org.apache.avro.io.TestResolvingIOResolving
> {code}





[jira] [Updated] (AVRO-1841) Add clientside githooks to do basic commit validation

2016-05-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1841:
---
Summary: Add clientside githooks to do basic commit validation  (was: 
Automatically verify the commit messages )

> Add clientside githooks to do basic commit validation
> -
>
> Key: AVRO-1841
> URL: https://issues.apache.org/jira/browse/AVRO-1841
> Project: Avro
>  Issue Type: Improvement
>  Components: build
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1841-20160507.patch
>
>
> Last week I made a commit and I made an error: The commit message was not 
> fully according to the right format.
> To avoid future mistakes I propose we introduce validation to the commit 
> message by using the git hooks.
> These can be run in two places: client side and server side.
> This ticket focuses on the client side hooks. If we decide to add this also 
> to the server side that must be a separate ticket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AVRO-1841) Add clientside githooks to do basic commit validation

2016-05-10 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated AVRO-1841:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed

> Add clientside githooks to do basic commit validation
> -
>
> Key: AVRO-1841
> URL: https://issues.apache.org/jira/browse/AVRO-1841
> Project: Avro
>  Issue Type: Improvement
>  Components: build
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Fix For: 1.8.1
>
> Attachments: AVRO-1841-20160507.patch
>
>
> Last week I made a commit and I made an error: The commit message was not 
> fully according to the right format.
> To avoid future mistakes I propose we introduce validation to the commit 
> message by using the git hooks.
> These can be run in two places: client side and server side.
> This ticket focuses on the client side hooks. If we decide to add this also 
> to the server side that must be a separate ticket.





[jira] [Created] (AVRO-1845) Invoking toString() method unexpectedly modified the avro record.

2016-05-13 Thread Niels Basjes (JIRA)
Niels Basjes created AVRO-1845:
--

 Summary: Invoking toString() method unexpectedly modified the avro 
record.
 Key: AVRO-1845
 URL: https://issues.apache.org/jira/browse/AVRO-1845
 Project: Avro
  Issue Type: Bug
Reporter: Niels Basjes
Assignee: Niels Basjes
Priority: Critical


Reported by Oleksandr Didukh (GitHub uid: sashadidukh)

Calling the toString method on a record that has a byte array field 
apparently changes the original data. 

Oleksandr put up a pull request: https://github.com/apache/avro/pull/88
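
The report does not say how the data gets changed. One common way for a read-only operation to alter a byte field in Java is by reading straight from a ByteBuffer, which advances its position. The sketch below shows that mechanism and a non-mutating alternative; it assumes this is the kind of effect involved, and the class and method names are hypothetical, not Avro code.

```java
import java.nio.ByteBuffer;

/** Shows how rendering a ByteBuffer can consume it, and how duplicate() avoids that. */
final class ByteBufferReadSketch {
    /** Renders the remaining bytes as hex by reading the buffer itself; this moves its position. */
    static String renderConsuming(ByteBuffer bytes) {
        StringBuilder sb = new StringBuilder();
        while (bytes.hasRemaining()) {
            sb.append(String.format("%02x", bytes.get()));
        }
        return sb.toString();
    }

    /** Renders the same bytes through duplicate(), leaving the caller's position untouched. */
    static String renderSafely(ByteBuffer bytes) {
        return renderConsuming(bytes.duplicate());
    }

    public static void main(String[] args) {
        ByteBuffer field = ByteBuffer.wrap(new byte[] {1, 2, 3});

        renderSafely(field);
        System.out.println(field.remaining()); // still 3

        renderConsuming(field);
        System.out.println(field.remaining()); // now 0: the field looks empty to later readers
    }
}
```

duplicate() shares the underlying bytes but keeps its own position and limit, so it costs no copy.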




