[jira] [Created] (AVRO-1614) Always getting a value...
Niels Basjes created AVRO-1614:
--

Summary: Always getting a value...
Key: AVRO-1614
URL: https://issues.apache.org/jira/browse/AVRO-1614
Project: Avro
Issue Type: New Feature
Components: java
Reporter: Niels Basjes

Sometimes the Avro structure becomes deeply nested.
If in such a scenario you want to be able to set a specific value deep in this tree you want to do this:

    public void setSomething(String value) {
      myStruct
        .getFoo()
        .getBar()
        .getOne()
        .getOther()
        .setSomething(value);
    }

The 'problem' I ran into is that any of the 4 get methods can return a null value so the code I have to write is really huge.
For every step in this method I have to build null checks and create the underlying instance if it is null.
I already started writing helper methods to do this for parts of my tree.

To solve this in a way that makes this code readable I came up with the following which I want to propose to you guys (before I start working on a patch).

My idea is to generate a new 'get' method in addition to the existing normal get method for the regular instance of the class.

So in addition to the

    public Foo getFoo() {
      return foo;
    }

I propose to generate something like this as well in the cases where this is a type of structure that you may want to traverse as shown in the example.

    public Foo getAlwaysFoo() {
      if (foo == null) {
        setFoo(Foo.newBuilder().build());
      }
      return foo;
    }

This way the automatically created instance immediately has all the defaults I have defined.

Assuming this naming my code will be readable because it will look like this:

    public void setSomething(String value) {
      myStruct
        .getAlwaysFoo()
        .getAlwaysBar()
        .getAlwaysOne()
        .getAlwaysOther()
        .setSomething(value);
    }

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
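To make the size difference concrete, here is a minimal sketch of the two styles, cut down to two levels of nesting. The classes are hand-written stand-ins, not Avro-generated code: `MyStruct`, `Foo`, `Bar` and `GetAlwaysDemo` are hypothetical, and plain constructors are used where the proposal would call `Foo.newBuilder().build()`.

```java
// Sketch only: hand-written stand-ins for generated record classes.
class GetAlwaysDemo {
    // What the caller writes today: one null check per level.
    static void setWithNullChecks(MyStruct myStruct, String value) {
        Foo foo = myStruct.getFoo();
        if (foo == null) {
            foo = new Foo();
            myStruct.setFoo(foo);
        }
        Bar bar = foo.getBar();
        if (bar == null) {
            bar = new Bar();
            foo.setBar(bar);
        }
        bar.setSomething(value);
    }

    // The same operation using the proposed getAlways accessors.
    static void setWithGetAlways(MyStruct myStruct, String value) {
        myStruct.getAlwaysFoo().getAlwaysBar().setSomething(value);
    }

    public static void main(String[] args) {
        MyStruct a = new MyStruct();
        setWithNullChecks(a, "x");
        MyStruct b = new MyStruct();
        setWithGetAlways(b, "x");
        System.out.println(a.getFoo().getBar().getSomething());
        System.out.println(b.getFoo().getBar().getSomething());
    }
}

class Bar {
    private String something;
    String getSomething() { return something; }
    void setSomething(String value) { something = value; }
}

class Foo {
    private Bar bar;
    Bar getBar() { return bar; }
    void setBar(Bar value) { bar = value; }

    // Proposed accessor: create the nested record on first access.
    // (Assumption: new Bar() stands in for Bar.newBuilder().build().)
    Bar getAlwaysBar() {
        if (bar == null) {
            setBar(new Bar());
        }
        return bar;
    }
}

class MyStruct {
    private Foo foo;
    Foo getFoo() { return foo; }
    void setFoo(Foo value) { foo = value; }

    Foo getAlwaysFoo() {
        if (foo == null) {
            setFoo(new Foo());
        }
        return foo;
    }
}
```

With four levels, as in the description, the null-check variant roughly doubles in length while the getAlways variant only gains two more calls in the chain.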
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Description:
Sometimes the Avro structure becomes deeply nested.
If in such a scenario you want to be able to set a specific value deep in this tree you want to do this:

{code}
public void setSomething(String value) {
  myStruct
    .getFoo()
    .getBar()
    .getOne()
    .getOther()
    .setSomething(value);
}
{code}

The 'problem' I ran into is that any of the 4 get methods can return a null value so the code I have to write is really huge.
For every step in this method I have to build null checks and create the underlying instance if it is null.
I already started writing helper methods to do this for parts of my tree.

To solve this in a way that makes this code readable I came up with the following which I want to propose to you guys (before I start working on a patch).

My idea is to generate a new 'get' method in addition to the existing normal get method for the regular instance of the class.

So in addition to the

{code}
public Foo getFoo() {
  return foo;
}
{code}

I propose to generate something like this as well in the cases where this is a type of structure that you may want to traverse as shown in the example.

{code}
public Foo getAlwaysFoo() {
  if (foo == null) {
    setFoo(Foo.newBuilder().build());
  }
  return foo;
}
{code}

This way the automatically created instance immediately has all the defaults I have defined.

Assuming this naming my code will be readable because it will look like this:

{code}
public void setSomething(String value) {
  myStruct
    .getAlwaysFoo()
    .getAlwaysBar()
    .getAlwaysOne()
    .getAlwaysOther()
    .setSomething(value);
}
{code}
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-20141027-v1.patch

First draft patch that does this.
[jira] [Commented] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227822#comment-14227822 ]

Niels Basjes commented on AVRO-1614:

The idea works but I also found that for my use case it is not very pleasant to work with.

Assume this example again:

{code}
public void setSomething(String value) {
  myStruct
    .getAlwaysFoo()
    .getAlwaysBar()
    .getAlwaysOne()
    .getAlwaysOther()
    .setSomething(value);
}
{code}

The main problem is that in order to do .getAlwaysOne() I MUST define ALL fields of that type with a default value.
What I don't like about that is that I want the schema definition to enforce the fact that some fields are mandatory.
By adding a default value to 'everything' I lose that capability of AVRO ... which I don't want.

At this point in time the only way I see to work around this (for me major) issue is to introduce something like a 'tree of incomplete Builders': when I call 'build()' on the top one it will build the entire tree.
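The objection about mandatory fields can be sketched as follows. This is an illustration with hand-written stand-ins, not Avro-generated code: `One`, `Parent` and `MandatoryFieldDemo` are hypothetical, and `IllegalStateException` is an assumed placeholder for whatever exception a generated `build()` raises when a field without a default is left unset.

```java
// Sketch only: shows why getAlwaysOne() forces every field to have a default.
class MandatoryFieldDemo {
    public static void main(String[] args) {
        Parent parent = new Parent();
        try {
            parent.getAlwaysOne();
        } catch (IllegalStateException e) {
            System.out.println("getAlwaysOne() failed: " + e.getMessage());
        }
    }
}

// Stand-in for a record whose schema has one mandatory and one defaulted field.
class One {
    private final String id;       // mandatory: no default in the schema
    private final String comment;  // optional: schema default is ""

    private One(String id, String comment) {
        this.id = id;
        this.comment = comment;
    }

    String getId() { return id; }
    String getComment() { return comment; }

    static Builder newBuilder() { return new Builder(); }

    static class Builder {
        private String id;
        private String comment = "";

        Builder setId(String value) { id = value; return this; }
        Builder setComment(String value) { comment = value; return this; }

        // Validation happens here, so an empty builder cannot be built.
        One build() {
            if (id == null) {
                throw new IllegalStateException("field 'id' not set and has no default");
            }
            return new One(id, comment);
        }
    }
}

class Parent {
    private One one;

    // The accessor from the first proposal: builds an empty One on demand.
    One getAlwaysOne() {
        if (one == null) {
            one = One.newBuilder().build();
        }
        return one;
    }
}
```

The only way to make `getAlwaysOne()` succeed in this sketch is to give `id` a default as well, which is exactly the loss of schema enforcement the comment describes.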
[jira] [Commented] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229895#comment-14229895 ]

Niels Basjes commented on AVRO-1614:

Working on alternative approach.
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-2014-12-01-v2.patch

First version of doing this via the Builder pattern.
[jira] [Commented] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229920#comment-14229920 ]

Niels Basjes commented on AVRO-1614:

Basic idea of this approach:

In a Builder, in addition to the actual value of a field, there can now also be a Builder instance for that field.
If that is used then you can hold the incomplete form of the sub-schema in a Builder.

So for any Builder instance there is a getFooBuilder() that either returns the existing Builder or creates a new Builder instance for the Foo field, if such a builder is supported.

As a consequence:
- schema validation is postponed until the actual build() is called.
- for the fields where this Builder is used the actual build() call becomes recursive.

So in my testing code I can now do this:

{code:Java}
Measurement.Builder measurementBuilder = Measurement.newBuilder();

measurementBuilder
  .getTransportBuilder()
  .getConnectionBuilder()
  .getNetworkConnectionBuilder()
  .setNetworkAddress("127.0.0.1")
  .setNetworkType(NetworkType.IPv4);

Measurement measurement = measurementBuilder.build();
{code}

Open question: I have not seen unit tests that validate the generated Java code. How to approach this?
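A minimal sketch of the nested-Builder idea described in the comment above, reduced to one level. The classes are hand-written and hypothetical (they borrow the names `Measurement` and `Transport` from the example but are not the generated code or the attached patch), and `IllegalStateException` is an assumed stand-in for the real validation error.

```java
// Sketch only: a Builder that can hold a Builder for a nested field.
class NestedBuilderDemo {
    public static void main(String[] args) {
        Measurement.Builder measurementBuilder = Measurement.newBuilder();
        measurementBuilder.getTransportBuilder().setNetworkAddress("127.0.0.1");
        Measurement measurement = measurementBuilder.build();
        System.out.println(measurement.getTransport().getNetworkAddress());
    }
}

class Transport {
    private final String networkAddress; // mandatory, no default

    private Transport(String networkAddress) { this.networkAddress = networkAddress; }

    String getNetworkAddress() { return networkAddress; }

    static Builder newBuilder() { return new Builder(); }

    static class Builder {
        private String networkAddress;

        Builder setNetworkAddress(String value) { networkAddress = value; return this; }

        // Validation is deferred until build() is called.
        Transport build() {
            if (networkAddress == null) {
                throw new IllegalStateException("networkAddress not set");
            }
            return new Transport(networkAddress);
        }
    }
}

class Measurement {
    private final Transport transport;

    private Measurement(Transport transport) { this.transport = transport; }

    Transport getTransport() { return transport; }

    static Builder newBuilder() { return new Builder(); }

    static class Builder {
        private Transport transport;                // finished value, if set directly
        private Transport.Builder transportBuilder; // incomplete value, if built lazily

        Builder setTransport(Transport value) {
            transport = value;
            transportBuilder = null; // a direct value replaces any pending builder
            return this;
        }

        // Returns the existing nested builder or creates a new one.
        Transport.Builder getTransportBuilder() {
            if (transportBuilder == null) {
                transportBuilder = Transport.newBuilder();
            }
            return transportBuilder;
        }

        // Recursive build: nested builders are built (and validated) here.
        Measurement build() {
            Transport value = (transportBuilder != null) ? transportBuilder.build() : transport;
            if (value == null) {
                throw new IllegalStateException("transport not set");
            }
            return new Measurement(value);
        }
    }
}
```

Because nothing is validated until the top-level `build()`, `networkAddress` can stay mandatory in this sketch: an incomplete tree only fails when it is actually built.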
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Release Note: [JAVA] Builders can now hold Builder instances of sub schemas.
Status: Patch Available (was: Open)
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Status: Open (was: Patch Available)
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Attachment: (was: AVRO-1614-2014-12-01-v2.patch)
[jira] [Created] (AVRO-1616) Fix .gitignore to exclude IntelliJ files.
Niels Basjes created AVRO-1616:
--

Summary: Fix .gitignore to exclude IntelliJ files.
Key: AVRO-1616
URL: https://issues.apache.org/jira/browse/AVRO-1616
Project: Avro
Issue Type: Improvement
Reporter: Niels Basjes
Priority: Trivial

The IntelliJ project files are not ignored in .gitignore
[jira] [Updated] (AVRO-1616) Fix .gitignore to exclude IntelliJ files.
[ https://issues.apache.org/jira/browse/AVRO-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1616:
---
Attachment: AVRO-1616-20141201-v1.patch

Simply added a few files to ignore in the future.
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Attachment: AVRO-1614-20141201-v2.patch

Fixed the patch (previous one had a big mistake in it).
[jira] [Updated] (AVRO-1616) Fix .gitignore to exclude IntelliJ files.
[ https://issues.apache.org/jira/browse/AVRO-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1616:
---
Status: Patch Available (was: Open)
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Status: Patch Available (was: Open)
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated AVRO-1614:
---
Status: Open (was: Patch Available)
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1614: --- Attachment: AVRO-1614-20141202-v3.patch Updated the patch: - Fixed a bug (cloning a Builder now clones recursively) - Fixed existing unit test - Added specific unit tests for new feature. Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1614: --- Labels: java (was: ) Status: Patch Available (was: Open) Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Labels: java Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch
[jira] [Commented] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232827#comment-14232827 ] Niels Basjes commented on AVRO-1614: @[~cutting]:
- Good idea to add additional tests to validate builder and Builder. I'll also be adding builderBuilder, value and this to ensure we have covered all the edge cases.
- While making this I also found a lot of whitespace changes. The main cause is that some of the files (the Player.java files) are generated, and simply adding this feature changed those files a lot. Because I was already impacting those files so much I chose to remove all the trailing spaces in one go... which you say is too much, ok. I understand the downside of this choice so I'll create a patch with the fewest possible whitespace changes. Shall I create a new issue afterwards to clean this up (I really like clean code!)?
- I don't quite understand the point regarding the tests. I put them under lang/java/ipc because there the compiler is available and can generate java code from the schema definitions. My tests are intended to validate that the generated code behaves as intended (I'm actually unit testing the code generated by the compiler). Putting them under ipc seemed like the best and easiest place. To avoid conflicts with existing testing code I added a new idl that resides in its own package: org.apache.avro.test.http. So did I put the tests I added in the right place? If not, what is the right place?
Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Labels: java Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch
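The behaviour that such unit tests are meant to pin down can be sketched without Avro at all. The block below is a minimal, hand-written illustration and not output of the Avro compiler: `MyStruct`, `Foo`, `Bar` and `GetAlwaysDemo` are hypothetical stand-ins for generated classes, and a plain `new` replaces `Foo.newBuilder().build()`, so no schema defaults are involved.

```java
// Hypothetical stand-ins for Avro-generated record classes (hand-written, not generated).
class Bar {
  private String something;

  public String getSomething() { return something; }

  public void setSomething(String value) { this.something = value; }
}

class Foo {
  private Bar bar;

  public Bar getBar() { return bar; }

  public void setBar(Bar value) { this.bar = value; }

  // Creates the nested instance on first access instead of returning null.
  public Bar getAlwaysBar() {
    if (bar == null) {
      setBar(new Bar());
    }
    return bar;
  }
}

class MyStruct {
  private Foo foo;

  public Foo getFoo() { return foo; }

  public void setFoo(Foo value) { this.foo = value; }

  // Same pattern one level up.
  public Foo getAlwaysFoo() {
    if (foo == null) {
      setFoo(new Foo());
    }
    return foo;
  }
}

class GetAlwaysDemo {
  public static void main(String[] args) {
    MyStruct myStruct = new MyStruct();

    // The regular getter is unchanged and still returns null for an unset field.
    System.out.println(myStruct.getFoo() == null);

    // The chained call needs no null checks; missing levels are created on the way.
    myStruct.getAlwaysFoo().getAlwaysBar().setSomething("hello");
    System.out.println(myStruct.getFoo().getBar().getSomething());
  }
}
```

Keeping the regular getter untouched means existing callers that rely on `null` to detect an unset field are not affected; only callers that opt in to the `getAlways` variant trigger instance creation.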
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1614: --- Status: Open (was: Patch Available) Canceling patch to implement review feedback. Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Labels: java Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1614: --- Status: Patch Available (was: Open) Please review Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Labels: java Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch, AVRO-1614-20141204-v4.patch
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1614: --- Attachment: AVRO-1614-20141204-v4.patch - Added new test idl to seek out collisions (builder, Builder, etc.). - Fixed (existing) bug that came to light when using the record name Builder. - Reduced the number of whitespace changes as much as possible. Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Labels: java Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch, AVRO-1614-20141204-v4.patch
[jira] [Created] (AVRO-1619) Generate better JavaDoc
Niels Basjes created AVRO-1619: -- Summary: Generate better JavaDoc Key: AVRO-1619 URL: https://issues.apache.org/jira/browse/AVRO-1619 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.7.7 Reporter: Niels Basjes
Assume the following IDL snippet:
{code}
@namespace("nl.basjes.avro.test")
protocol Something {
  record MyRecord {
    /** The time (epoch in milliseconds since 1970-01-01) */
    long timestamp;
  }
}
{code}
The currently generated java code looks like this:
{code}
/**
 * Gets the value of the 'timestamp' field.
 * The time (epoch in milliseconds since 1970-01-01) when the event occurred */
public java.lang.Long getTimestamp() {
  return timestamp;
}

/**
 * Sets the value of the 'timestamp' field.
 * The time (epoch in milliseconds since 1970-01-01) when the event occurred * @param value the value to set.
 */
public void setTimestamp(java.lang.Long value) {
  this.timestamp = value;
}
{code}
Because the @param is not on a new line, it is not shown in my IDE (IntelliJ 14) as a parameter. In addition, the getters and setters within the Builder are missing these comments and the @param completely.
{code}
/** Gets the value of the 'timestamp' field */
public java.lang.Long getTimestamp() {
  return timestamp;
}

/** Sets the value of the 'timestamp' field */
public nl.basjes.avro.test.MyRecord.Builder setTimestamp(long value) {
  validate(fields()[0], value);
  this.timestamp = value;
  fieldSetFlags()[0] = true;
  return this;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
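One possible shape for the improved output is sketched below. This is only an illustration of the two points raised in the report (the `@param` tag on its own line, and the field documentation repeated on the Builder accessors); it is hand-written and not what the attached patch or the Avro compiler actually generates. `MyRecordSketch` and its nested `Builder` are hypothetical names, and the Avro-specific calls such as `validate(...)` and `fieldSetFlags()` are left out so the block compiles on its own.

```java
/** Hypothetical stand-in for a generated record class; not actual Avro compiler output. */
class MyRecordSketch {
  private Long timestamp;

  /**
   * Gets the value of the 'timestamp' field.
   * The time (epoch in milliseconds since 1970-01-01)
   * @return the value of the 'timestamp' field.
   */
  public Long getTimestamp() {
    return timestamp;
  }

  /**
   * Sets the value of the 'timestamp' field.
   * The time (epoch in milliseconds since 1970-01-01)
   * @param value the value to set.
   */
  public void setTimestamp(Long value) {
    this.timestamp = value;
  }

  /** Simplified builder; the real generated Builder also validates and tracks set flags. */
  static class Builder {
    private long timestamp;

    /**
     * Gets the value of the 'timestamp' field.
     * The time (epoch in milliseconds since 1970-01-01)
     * @return the value currently held by this builder.
     */
    public Long getTimestamp() {
      return timestamp;
    }

    /**
     * Sets the value of the 'timestamp' field.
     * The time (epoch in milliseconds since 1970-01-01)
     * @param value the value to set.
     * @return this builder.
     */
    public Builder setTimestamp(long value) {
      this.timestamp = value;
      return this;
    }

    /**
     * Creates the record from the values held by this builder.
     * @return a new record instance.
     */
    public MyRecordSketch build() {
      MyRecordSketch record = new MyRecordSketch();
      record.setTimestamp(timestamp);
      return record;
    }
  }
}
```

With each block tag starting its own line, the standard Javadoc tool and IDEs can recognise `@param` and `@return` as tags instead of treating them as part of the preceding description text.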
[jira] [Updated] (AVRO-1619) Generate better JavaDoc
[ https://issues.apache.org/jira/browse/AVRO-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1619: --- Attachment: AVRO-1619-2014-12-11-v1.patch Unfortunately changing the comments also means massive changes in the two Player.java files. Generate better JavaDoc --- Key: AVRO-1619 URL: https://issues.apache.org/jira/browse/AVRO-1619 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.7.7 Reporter: Niels Basjes Attachments: AVRO-1619-2014-12-11-v1.patch
[jira] [Updated] (AVRO-1619) Generate better JavaDoc
[ https://issues.apache.org/jira/browse/AVRO-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1619: --- Status: Patch Available (was: Open) Generate better JavaDoc --- Key: AVRO-1619 URL: https://issues.apache.org/jira/browse/AVRO-1619 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.7.7 Reporter: Niels Basjes Attachments: AVRO-1619-2014-12-11-v1.patch
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1614: --- Status: Open (was: Patch Available) Patch no longer merges after recent commits. Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Labels: java Attachments: AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch, AVRO-1614-20141204-v4.patch
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1614: --- Attachment: AVRO-1614-2014-12-16-v5.patch - Patch updated so it merges again. - Minor layout / javadoc tweaks compared to previous version. Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Labels: java Attachments: AVRO-1614-2014-12-16-v5.patch, AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch, AVRO-1614-20141204-v4.patch
[jira] [Updated] (AVRO-1614) Always getting a value...
[ https://issues.apache.org/jira/browse/AVRO-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1614: --- Status: Patch Available (was: Open) [~cutting] Updated patch Always getting a value... - Key: AVRO-1614 URL: https://issues.apache.org/jira/browse/AVRO-1614 Project: Avro Issue Type: New Feature Components: java Reporter: Niels Basjes Labels: java Attachments: AVRO-1614-2014-12-16-v5.patch, AVRO-1614-20141027-v1.patch, AVRO-1614-20141201-v2.patch, AVRO-1614-20141202-v3.patch, AVRO-1614-20141204-v4.patch
[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment
[ https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248477#comment-14248477 ] Niels Basjes commented on AVRO-1537: I'm just reading up on Docker and just had an idea: what if instead of doing
{code}
# Clone Avro
RUN git clone http://git.apache.org/avro.git
RUN svn checkout https://svn.apache.org/repos/asf/avro/trunk/ avro-trunk
{code}
you used something like (warning: I just read about this command; I haven't had time yet to try it out)
{code}
ONBUILD ADD . avro
{code}
See: https://docs.docker.com/reference/builder/#onbuild That would add your local version of Avro (which will most likely contain the change you're working on) to the container. In that scenario you would be able to run all tests for all languages/versions on the patched code before submitting it.
Make it easier to set up a multi-language build environment --- Key: AVRO-1537 URL: https://issues.apache.org/jira/browse/AVRO-1537 Project: Avro Issue Type: Improvement Reporter: Martin Kleppmann Attachments: AVRO-1537.patch, AVRO-1537.patch It's currently quite tedious to set up an environment in which the Avro test suites for all supported languages can be run, and in which release candidates can be built. This is especially so when we need to test against several different versions of a programming language or VM (e.g. JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1). Our shared Hudson server isn't an ideal solution, because it only runs tests on changes that are already committed, and maintenance of the server can't easily be shared across the community. I think a Docker image might be a good solution, since it could be set up by one person, shared with all Avro developers, and maintained by the community on an ongoing basis. But other VM solutions (Vagrant, for example?) might work just as well. Suggestions welcome.
Related resources: * Using AWS (setting up an EC2 instance for Avro build and release): https://cwiki.apache.org/confluence/display/AVRO/How+To+Release#HowToRelease-UsingAWSforAvroBuildandRelease * Testing multiple versions of Ruby in CI: AVRO-1515 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment
[ https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248516#comment-14248516 ] Niels Basjes commented on AVRO-1537: Perhaps using this option with docker run is better: {code} -v $(pwd):/somewhere/avro {code} Make it easier to set up a multi-language build environment --- Key: AVRO-1537 URL: https://issues.apache.org/jira/browse/AVRO-1537 Project: Avro Issue Type: Improvement Reporter: Martin Kleppmann Attachments: AVRO-1537.patch, AVRO-1537.patch
[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment
[ https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249839#comment-14249839 ] Niels Basjes commented on AVRO-1537: Hi [~tomwhite], I played around with the last version you uploaded. (Apparently I wasn't looking at the last version when I wrote the previous comment.) What I changed here locally is that I removed the last 3 lines from your Dockerfile (i.e. the git clone and svn checkout) and changed the build.sh as shown below. Now you can do a simple {code}./build.sh safetest{code} and validate the entire build for all languages without needing to install anything tricky. If I want to run a single command manually from within that environment I can simply do {code}./build.sh shell{code} and get a shell in the docker container. Let me know what you think of this.
{code}
diff --git a/build.sh b/build.sh
index 06961c0..cff8dad 100755
--- a/build.sh
+++ b/build.sh
@@ -17,12 +17,16 @@
 
 set -e # exit on error
 
-cd `dirname $0` # connect to root
+SOURCEDIR=$( cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
+cd ${SOURCEDIR} # connect to root
 
 VERSION=`cat share/VERSION.txt`
 
+# The cryptic name is to avoid conflicts in case of a shared development system (like jenkins)
+DOCKER_IMAGE_NAME=avro_build_image_$(md5sum ${SOURCEDIR}/docker/Dockerfile | cut -d' ' -f1)
+
 function usage {
-  echo "Usage: $0 {test|dist|sign|clean}"
+  echo "Usage: $0 {test|safetest|shell|dist|sign|clean}"
   exit 1
 }
 
@@ -31,6 +35,13 @@ then
   usage
 fi
 
+function runindocker {
+docker build -t ${DOCKER_IMAGE_NAME} ${SOURCEDIR}/docker
+# By mapping the .m2 directory you can do an mvn install from within the container and use the result on your normal system.
+# And this also is a significant speedup in subsequent builds because the dependencies are downloaded only once.
+docker run --rm=true -t -i -v ${SOURCEDIR}:/root/avro -w /root/avro -v ${HOME}/.m2:/root/.m2 ${DOCKER_IMAGE_NAME} $1 $2 $3 $4 $5 $6
+}
+
 set -x # echo commands
 
 for target in $@
@@ -72,7 +83,14 @@ case $target in
 (cd lang/java; mvn package -DskipTests)
 # run interop rpc test
 /bin/bash share/test/interop/bin/test_rpc_interop.sh
+ ;;
+
+safetest)
+ runindocker ./build.sh test
+ ;;
 
+shell)
+ runindocker /bin/bash
 ;;
 
 dist)
{code}
Make it easier to set up a multi-language build environment --- Key: AVRO-1537 URL: https://issues.apache.org/jira/browse/AVRO-1537 Project: Avro Issue Type: Improvement Reporter: Martin Kleppmann Attachments: AVRO-1537.patch, AVRO-1537.patch
[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment
[ https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251528#comment-14251528 ] Niels Basjes commented on AVRO-1537: Hi [~tomwhite], I verified this on my system: running the script for the first time takes 'ages', but running it the second time I see the command prompt within two seconds. So if you get very different results I suspect a problem on your system. I agree that the 'safetest' name is a poor choice. Perhaps simply having the {{./build.sh shell}} is good enough for the first version? I also noticed that all files in the avro directory become owned by {{root}} because that is the default user inside docker. So after exiting the docker environment I could no longer do a {{mvn clean}} because the generated files (.class, .jar, etc) are all stored in directories owned by the root user. In an attempt to solve this I came up with the following. I effectively create an additional layer that simply recreates the current user (same username, userid, groupid) in the docker environment. {code} function runindocker { docker build -t ${DOCKER_IMAGE_NAME} ${SOURCEDIR}/docker docker build -t ${DOCKER_IMAGE_NAME}_${USER} - <<UserSpecificDocker FROM ${DOCKER_IMAGE_NAME} RUN groupadd -g $(id -g) ${USER} RUN useradd -g $(id -g) -u $(id -u) -k /root -m ${USER} ENV HOME /home/${USER} UserSpecificDocker # By mapping the .m2 directory you can do an mvn install from within the container and use the result on your normal system. # And this also is a significant speedup in subsequent builds because the dependencies are downloaded only once. 
docker run --rm=true -t -i -v ${SOURCEDIR}:/home/${USER}/avro -w /home/${USER}/avro -v ${HOME}/.m2:/home/${USER}/.m2 -u ${USER} ${DOCKER_IMAGE_NAME}_${USER} $1 $2 $3 $4 $5 $6 } {code} Make it easier to set up a multi-language build environment --- Key: AVRO-1537 URL: https://issues.apache.org/jira/browse/AVRO-1537 Project: Avro Issue Type: Improvement Reporter: Martin Kleppmann Assignee: Tom White Attachments: AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch It's currently quite tedious to set up an environment in which the Avro test suites for all supported languages can be run, and in which release candidates can be built. This is especially so when we need to test against several different versions of a programming language or VM (e.g. JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1). Our shared Hudson server isn't an ideal solution, because it only runs tests on changes that are already committed, and maintenance of the server can't easily be shared across the community. I think a Docker image might be a good solution, since it could be set up by one person, shared with all Avro developers, and maintained by the community on an ongoing basis. But other VM solutions (Vagrant, for example?) might work just as well. Suggestions welcome. Related resources: * Using AWS (setting up an EC2 instance for Avro build and release): https://cwiki.apache.org/confluence/display/AVRO/How+To+Release#HowToRelease-UsingAWSforAvroBuildandRelease * Testing multiple versions of Ruby in CI: AVRO-1515 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment
[ https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253656#comment-14253656 ] Niels Basjes commented on AVRO-1537: Hi [~tomwhite], I checked your current patch and I really like it. I did a test on CentOS 6.5 x86_64 inside a VirtualBox on Windows 7 and this works really well. The first build takes a while and then subsequent runs are on the screen within a second. The Java build went perfectly and I let it run for a while and it all seems to work fine (until it all stops with the known bug {{./build.sh: line 25: node_modules/grunt/bin/grunt: No such file or directory}}). A few feedback points: # Using naming like {{avro-build}} and {{avro-build-$USER}} means that there can only be a single image at a time with that name. So in the scenario where you want to enhance the image (i.e. edit the Dockerfile), a different user on the same system cannot use the build system reliably; there will be conflicts between those two users. I realize this is an extremely unlikely scenario in normal development situations. I'm not sure how unlikely it becomes when we make this docker part of the normal build and run it on a shared CI environment (i.e. Jenkins) where everything runs as the same user. I think it is fine to do it in this form as long as we realize (and preferably document) this caveat. Perhaps a comment as simple as "The build.sh docker environment is intended for use on personal development systems." should suffice. # The Dockerfile does the installation of the various things by means of naming the tool that needs to be installed. Then this docker image will be different if for one of the tools a new version is released. So I was wondering: ## What happens if in the future one of those tools creates a regression or an incompatible change in their API? Like the PHP NaN example? 
Is that a good thing because you're actually building and testing against what end users will be using too? Or is this a bad thing because you have a changing build environment? ## What happens if in the future one of those tools is updated and it is an update that is really needed for the build to succeed? How can it be validated/ensured that the image is updated too on the desktop of the developers? Perhaps the project can be enhanced to ensure/enforce minimal versions of the tools that are needed? # I vote you push back the improvements towards HBase where you based the original on (at least make a ticket in HBase {{Evaluate Docker improvements from AVRO project}}). Make it easier to set up a multi-language build environment --- Key: AVRO-1537 URL: https://issues.apache.org/jira/browse/AVRO-1537 Project: Avro Issue Type: Improvement Reporter: Martin Kleppmann Assignee: Tom White Attachments: AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch It's currently quite tedious to set up an environment in which the Avro test suites for all supported languages can be run, and in which release candidates can be built. This is especially so when we need to test against several different versions of a programming language or VM (e.g. JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1). Our shared Hudson server isn't an ideal solution, because it only runs tests on changes that are already committed, and maintenance of the server can't easily be shared across the community. I think a Docker image might be a good solution, since it could be set up by one person, shared with all Avro developers, and maintained by the community on an ongoing basis. But other VM solutions (Vagrant, for example?) might work just as well. Suggestions welcome. 
Related resources: * Using AWS (setting up an EC2 instance for Avro build and release): https://cwiki.apache.org/confluence/display/AVRO/How+To+Release#HowToRelease-UsingAWSforAvroBuildandRelease * Testing multiple versions of Ruby in CI: AVRO-1515 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AVRO-1624) Surefire forkMode is deprecated
Niels Basjes created AVRO-1624: -- Summary: Surefire forkMode is deprecated Key: AVRO-1624 URL: https://issues.apache.org/jira/browse/AVRO-1624 Project: Avro Issue Type: Bug Components: java, trevni Reporter: Niels Basjes During a java build I see this warning message several times: {code} [INFO] --- maven-surefire-plugin:2.17:test (default-test) @ avro --- [WARNING] The parameter forkMode is deprecated since version 2.14. Use forkCount and reuseForks instead. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1624) Surefire forkMode is deprecated
[ https://issues.apache.org/jira/browse/AVRO-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253673#comment-14253673 ] Niels Basjes commented on AVRO-1624: Other projects have also gotten this: https://github.com/apache/camel/pull/331/files Surefire forkMode is deprecated --- Key: AVRO-1624 URL: https://issues.apache.org/jira/browse/AVRO-1624 Project: Avro Issue Type: Bug Components: java, trevni Reporter: Niels Basjes During a java build I see this warning message several times: {code} [INFO] --- maven-surefire-plugin:2.17:test (default-test) @ avro --- [WARNING] The parameter forkMode is deprecated since version 2.14. Use forkCount and reuseForks instead. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1624) Surefire forkMode is deprecated
[ https://issues.apache.org/jira/browse/AVRO-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1624: --- Attachment: AVRO-1624-2014-12-19-v1.patch {code} $ find . -type f -print0 | xargs -0 fgrep -i forkMode ./lang/java/pom.xml:<forkMode>always</forkMode> ./lang/java/pom.xml: <forkMode>once</forkMode> {code} http://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html {quote} |Old Setting|New Setting| |forkMode=once (default)| forkCount=1 (default), reuseForks=true (default)| |forkMode=always|forkCount=1 (default), reuseForks=false| {quote} Surefire forkMode is deprecated --- Key: AVRO-1624 URL: https://issues.apache.org/jira/browse/AVRO-1624 Project: Avro Issue Type: Bug Components: java, trevni Reporter: Niels Basjes Attachments: AVRO-1624-2014-12-19-v1.patch During a java build I see this warning message several times: {code} [INFO] --- maven-surefire-plugin:2.17:test (default-test) @ avro --- [WARNING] The parameter forkMode is deprecated since version 2.14. Use forkCount and reuseForks instead. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1624) Surefire forkMode is deprecated
[ https://issues.apache.org/jira/browse/AVRO-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1624: --- Status: Patch Available (was: Open) Surefire forkMode is deprecated --- Key: AVRO-1624 URL: https://issues.apache.org/jira/browse/AVRO-1624 Project: Avro Issue Type: Bug Components: java, trevni Reporter: Niels Basjes Attachments: AVRO-1624-2014-12-19-v1.patch During a java build I see this warning message several times: {code} [INFO] --- maven-surefire-plugin:2.17:test (default-test) @ avro --- [WARNING] The parameter forkMode is deprecated since version 2.14. Use forkCount and reuseForks instead. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment
[ https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14254113#comment-14254113 ] Niels Basjes commented on AVRO-1537: On Linux docker does not _require_ sudo. In fact the docker manual says this: https://docs.docker.com/installation/ubuntulinux/#giving-non-root-access {quote} Starting in version 0.5.3, if you (or your Docker installer) create a Unix group called docker and add users to it, then the docker daemon will make the ownership of the Unix socket read/writable by the docker group when the daemon starts. The docker daemon must always run as the root user, but if you run the docker client as a user in the docker group then you don't need to add sudo to all the client commands. {quote} I always set it up like this (i.e. adding my user to the docker group) and this means that the $SUDO_USER is always unset. So to make both situations work I expect we should have something that only overwrites $USER with $SUDO_USER if it was set. Make it easier to set up a multi-language build environment --- Key: AVRO-1537 URL: https://issues.apache.org/jira/browse/AVRO-1537 Project: Avro Issue Type: Improvement Reporter: Martin Kleppmann Assignee: Tom White Attachments: AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch It's currently quite tedious to set up an environment in which the Avro test suites for all supported languages can be run, and in which release candidates can be built. This is especially so when we need to test against several different versions of a programming language or VM (e.g. JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1). Our shared Hudson server isn't an ideal solution, because it only runs tests on changes that are already committed, and maintenance of the server can't easily be shared across the community. 
I think a Docker image might be a good solution, since it could be set up by one person, shared with all Avro developers, and maintained by the community on an ongoing basis. But other VM solutions (Vagrant, for example?) might work just as well. Suggestions welcome. Related resources: * Using AWS (setting up an EC2 instance for Avro build and release): https://cwiki.apache.org/confluence/display/AVRO/How+To+Release#HowToRelease-UsingAWSforAvroBuildandRelease * Testing multiple versions of Ruby in CI: AVRO-1515 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment
[ https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254963#comment-14254963 ] Niels Basjes commented on AVRO-1537: I just realized that the original issue states {quote}This is especially so when we need to test against several different versions of a programming language or VM (e.g. JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1).{quote} What we have created so far is actually a good way of setting up the 'default' build environment (which is a very good first step). Any ideas on how we're going to approach the 'real question'? Make it easier to set up a multi-language build environment --- Key: AVRO-1537 URL: https://issues.apache.org/jira/browse/AVRO-1537 Project: Avro Issue Type: Improvement Reporter: Martin Kleppmann Assignee: Tom White Attachments: AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch It's currently quite tedious to set up an environment in which the Avro test suites for all supported languages can be run, and in which release candidates can be built. This is especially so when we need to test against several different versions of a programming language or VM (e.g. JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1). Our shared Hudson server isn't an ideal solution, because it only runs tests on changes that are already committed, and maintenance of the server can't easily be shared across the community. I think a Docker image might be a good solution, since it could be set up by one person, shared with all Avro developers, and maintained by the community on an ongoing basis. But other VM solutions (Vagrant, for example?) might work just as well. Suggestions welcome. 
Related resources: * Using AWS (setting up an EC2 instance for Avro build and release): https://cwiki.apache.org/confluence/display/AVRO/How+To+Release#HowToRelease-UsingAWSforAvroBuildandRelease * Testing multiple versions of Ruby in CI: AVRO-1515 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1630) Creating Builder from instance loses data
[ https://issues.apache.org/jira/browse/AVRO-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303019#comment-14303019 ] Niels Basjes commented on AVRO-1630: @[~tomwhite] What else is needed to get this committed? Creating Builder from instance loses data - Key: AVRO-1630 URL: https://issues.apache.org/jira/browse/AVRO-1630 Project: Avro Issue Type: Bug Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1630-2015-01-14-v1.patch If you create a builder from an instance and then use the .getXxxBuilder() method you get an empty builder instead of a builder that contains the data elements from the original instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1630) Creating Builder from instance loses data
[ https://issues.apache.org/jira/browse/AVRO-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277847#comment-14277847 ] Niels Basjes commented on AVRO-1630: I consider this a usage scenario I overlooked when writing AVRO-1614. [~cutting] / [~tomwhite] Can you guys please double check I haven't missed anything? Creating Builder from instance loses data - Key: AVRO-1630 URL: https://issues.apache.org/jira/browse/AVRO-1630 Project: Avro Issue Type: Bug Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1630-2015-01-14-v1.patch If you create a builder from an instance and then use the .getXxxBuilder() method you get an empty builder instead of a builder that contains the data elements from the original instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.
[ https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1633: --- Description: The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. Disadvantage: Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. This may be considered breaking backward compatibility! was: The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. Disadvantage: Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. Add additional setXxx(Builder) method to make user code more readable. 
-- Key: AVRO-1633 URL: https://issues.apache.org/jira/browse/AVRO-1633 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1633-2015-01-20-v1.patch The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. Disadvantage: Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. This may be considered breaking backward compatibility! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.
[ https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1633: --- Description: The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: * You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. * User code becomes a bit more readable. Disadvantages: * Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. ** _This may be considered breaking backward compatibility!_ To solve this you must do either this: {code}.setConnection((NetworkConnection)null){code} or this {code} NetworkConnection nc = null; ... .setConnection(nc){code} was: The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. Disadvantage: Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. This may be considered breaking backward compatibility! Add additional setXxx(Builder) method to make user code more readable. 
-- Key: AVRO-1633 URL: https://issues.apache.org/jira/browse/AVRO-1633 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1633-2015-01-20-v1.patch The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: * You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. * User code becomes a bit more readable. Disadvantages: * Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. ** _This may be considered breaking backward compatibility!_ To solve this you must do either this: {code}.setConnection((NetworkConnection)null){code} or this {code} NetworkConnection nc = null; ... .setConnection(nc){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
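The {{null}} ambiguity described in AVRO-1633 above can be reproduced without Avro at all. The following is a minimal sketch, assuming two hand-written stand-in classes ({{Conn}} and {{Req}} are hypothetical names, not the generated {{NetworkConnection}} and {{Request}}) and assuming the proposed overload simply stores the builder; the {{lastCall}} field exists only so the chosen overload is observable.

```java
// Hypothetical stand-ins for generated classes; not the real Avro-generated code.
final class Conn {
    static final class Builder {
        Conn build() { return new Conn(); }
    }
}

final class Req {
    static final class Builder {
        Conn connection;
        Conn.Builder connectionBuilder;
        String lastCall = "none"; // illustration only: records which overload ran

        // Existing style of setter: takes the finished value.
        Builder setConnection(Conn value) {
            connection = value;
            connectionBuilder = null;
            lastCall = "value";
            return this;
        }

        // Proposed additional overload: takes the nested builder (assumed behaviour).
        Builder setConnection(Conn.Builder value) {
            connectionBuilder = value;
            connection = null;
            lastCall = "builder";
            return this;
        }
    }
}

class Avro1633Demo {
    public static void main(String[] args) {
        Req.Builder b = new Req.Builder();

        // b.setConnection(null);
        // ^ does not compile: javac reports the reference to setConnection as
        //   ambiguous, because a bare null matches both overloads equally well.

        b.setConnection((Conn) null);        // workaround 1: explicit cast
        System.out.println(b.lastCall);

        Conn nc = null;
        b.setConnection(nc);                 // workaround 2: typed variable
        System.out.println(b.lastCall);

        b.setConnection(new Conn.Builder()); // the new convenience call
        System.out.println(b.lastCall);
    }
}
```

Only call sites that pass a literal {{null}} are affected; calls that pass a typed expression keep compiling unchanged, which is why the compatibility impact is limited to source code and to that one pattern.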
[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.
[ https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1633: --- Status: Patch Available (was: Open) Hi [~cutting], Please classify this as either _wanted_ or _unwanted_ because of this backward compatibility point when setting a 'null' directly. Thanks. Add additional setXxx(Builder) method to make user code more readable. -- Key: AVRO-1633 URL: https://issues.apache.org/jira/browse/AVRO-1633 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1633-2015-01-20-v1.patch The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: * You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. * User code becomes a bit more readable. Disadvantages: * Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. ** _This may be considered breaking backward compatibility!_ To solve this you must do either this: {code}.setConnection((NetworkConnection)null){code} or this {code} NetworkConnection nc = null; ... .setConnection(nc){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.
[ https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1633: --- Attachment: AVRO-1633-2015-01-20-v1.patch The patch that implements the proposal. Add additional setXxx(Builder) method to make user code more readable. -- Key: AVRO-1633 URL: https://issues.apache.org/jira/browse/AVRO-1633 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1633-2015-01-20-v1.patch The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. Disadvantage: Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AVRO-1630) Creating Builder from instance loses data
Niels Basjes created AVRO-1630: -- Summary: Creating Builder from instance loses data Key: AVRO-1630 URL: https://issues.apache.org/jira/browse/AVRO-1630 Project: Avro Issue Type: Bug Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 If you create a builder from an instance and then use the .getXxxBuilder() method you get an empty builder instead of a builder that contains the data elements from the original instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1630) Creating Builder from instance loses data
[ https://issues.apache.org/jira/browse/AVRO-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1630: --- Status: Patch Available (was: Open) Creating Builder from instance loses data - Key: AVRO-1630 URL: https://issues.apache.org/jira/browse/AVRO-1630 Project: Avro Issue Type: Bug Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1630-2015-01-14-v1.patch If you create a builder from an instance and then use the .getXxxBuilder() method you get an empty builder instead of a builder that contains the data elements from the original instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1630) Creating Builder from instance loses data
[ https://issues.apache.org/jira/browse/AVRO-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1630: --- Attachment: AVRO-1630-2015-01-14-v1.patch This fix changes the getXxxBuilder method from this {code} public org.apache.avro.test.http.NetworkConnection.Builder getConnectionBuilder() { if (connectionBuilder == null) { setConnectionBuilder(org.apache.avro.test.http.NetworkConnection.newBuilder()); } return connectionBuilder; } {code} into this {code} public org.apache.avro.test.http.NetworkConnection.Builder getConnectionBuilder() { if (connectionBuilder == null) { if (hasConnection()) { setConnectionBuilder(org.apache.avro.test.http.NetworkConnection.newBuilder(connection)); } else { setConnectionBuilder(org.apache.avro.test.http.NetworkConnection.newBuilder()); } } return connectionBuilder; } {code} Creating Builder from instance loses data - Key: AVRO-1630 URL: https://issues.apache.org/jira/browse/AVRO-1630 Project: Avro Issue Type: Bug Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1630-2015-01-14-v1.patch If you create a builder from an instance and then use the .getXxxBuilder() method you get an empty builder instead of a builder that contains the data elements from the original instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
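To make the effect of the AVRO-1630 change above easier to see, here is a minimal self-contained sketch. The classes below are hand-written stand-ins, not the code Avro generates: the field {{networkAddress}}, the method name {{getConnectionBuilderOld}}, and the simplified {{hasConnection()}} (a plain null check rather than Avro's field-set flags) are all assumptions made for illustration.

```java
// Hand-written stand-ins; the real classes are generated by the Avro compiler.
final class NetworkConnection {
    final String networkAddress; // hypothetical field

    NetworkConnection(String networkAddress) { this.networkAddress = networkAddress; }

    static Builder newBuilder() { return new Builder(); }

    // Copying factory: starts the builder from an existing instance.
    static Builder newBuilder(NetworkConnection other) {
        Builder b = new Builder();
        b.networkAddress = other.networkAddress;
        return b;
    }

    static final class Builder {
        String networkAddress;
        Builder setNetworkAddress(String v) { networkAddress = v; return this; }
        NetworkConnection build() { return new NetworkConnection(networkAddress); }
    }
}

final class Request {
    final NetworkConnection connection;

    Request(NetworkConnection connection) { this.connection = connection; }

    static Builder newBuilder(Request other) {
        Builder b = new Builder();
        b.connection = other.connection;
        return b;
    }

    static final class Builder {
        NetworkConnection connection;
        NetworkConnection.Builder connectionBuilder;

        // Simplified assumption: "set" means "not null".
        boolean hasConnection() { return connection != null; }

        // Behaviour before the patch: always starts from an empty nested builder.
        NetworkConnection.Builder getConnectionBuilderOld() {
            if (connectionBuilder == null) {
                connectionBuilder = NetworkConnection.newBuilder();
            }
            return connectionBuilder;
        }

        // Behaviour after the patch: seeds the nested builder from the existing value.
        NetworkConnection.Builder getConnectionBuilder() {
            if (connectionBuilder == null) {
                connectionBuilder = hasConnection()
                        ? NetworkConnection.newBuilder(connection)
                        : NetworkConnection.newBuilder();
            }
            return connectionBuilder;
        }
    }
}

class Avro1630Demo {
    public static void main(String[] args) {
        Request original = new Request(new NetworkConnection("10.0.0.1"));

        // Old getter: the address from the original instance is lost.
        System.out.println(Request.newBuilder(original).getConnectionBuilderOld().networkAddress);

        // Patched getter: the address is carried over into the nested builder.
        System.out.println(Request.newBuilder(original).getConnectionBuilder().networkAddress);
    }
}
```

The only design point is where the copy happens: the nested builder is created lazily on first access, so that is also the place where the existing value has to be taken into account.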
[jira] [Commented] (AVRO-1537) Make it easier to set up a multi-language build environment
[ https://issues.apache.org/jira/browse/AVRO-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14272447#comment-14272447 ] Niels Basjes commented on AVRO-1537: +1 (non-binding) Make it easier to set up a multi-language build environment --- Key: AVRO-1537 URL: https://issues.apache.org/jira/browse/AVRO-1537 Project: Avro Issue Type: Improvement Reporter: Martin Kleppmann Assignee: Tom White Attachments: AVRO-1537-2015-01-08.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch, AVRO-1537.patch It's currently quite tedious to set up an environment in which the Avro test suites for all supported languages can be run, and in which release candidates can be built. This is especially so when we need to test against several different versions of a programming language or VM (e.g. JDK6/JDK7/JDK8, Ruby 1.8.7/1.9.3/2.0/2.1). Our shared Hudson server isn't an ideal solution, because it only runs tests on changes that are already committed, and maintenance of the server can't easily be shared across the community. I think a Docker image might be a good solution, since it could be set up by one person, shared with all Avro developers, and maintained by the community on an ongoing basis. But other VM solutions (Vagrant, for example?) might work just as well. Suggestions welcome. Related resources: * Using AWS (setting up an EC2 instance for Avro build and release): https://cwiki.apache.org/confluence/display/AVRO/How+To+Release#HowToRelease-UsingAWSforAvroBuildandRelease * Testing multiple versions of Ruby in CI: AVRO-1515 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.
[ https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521500#comment-14521500 ] Niels Basjes commented on AVRO-1633: For me having such polymorphism in the API makes the API easier to use and increases code readability. Add additional setXxx(Builder) method to make user code more readable. -- Key: AVRO-1633 URL: https://issues.apache.org/jira/browse/AVRO-1633 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1633-2015-01-20-v1.patch The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: * You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. * User code becomes a bit more readable. Disadvantages: * Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. ** _This may be considered breaking backward compatibility!_ To solve this you must do either this: {code}.setConnection((NetworkConnection)null){code} or this {code} NetworkConnection nc = null; ... .setConnection(nc){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
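The null ambiguity mentioned under the disadvantages can be reproduced with two unrelated parameter types. This is a rough sketch with hypothetical hand-written classes (NetworkConnection, RequestBuilderSketch), not the generated Avro code; the lastCall field exists only to show which overload was chosen.

```java
// Hypothetical stand-ins; only the overload shapes mirror the proposal.
final class NetworkConnection {
    static final class Builder {
        NetworkConnection build() { return new NetworkConnection(); }
    }
}

final class RequestBuilderSketch {
    String lastCall; // records which overload ran, for illustration only

    RequestBuilderSketch setConnection(NetworkConnection value) {
        lastCall = "instance";
        return this;
    }

    RequestBuilderSketch setConnection(NetworkConnection.Builder value) {
        lastCall = "builder";
        return this;
    }

    public static void main(String[] args) {
        RequestBuilderSketch b = new RequestBuilderSketch();

        // b.setConnection(null);  // does not compile: the call is ambiguous,
        //                         // because neither parameter type is more specific.

        b.setConnection((NetworkConnection) null); // explicit cast picks the overload
        System.out.println(b.lastCall);

        NetworkConnection nc = null;               // typed variable works as well
        b.setConnection(nc);
        System.out.println(b.lastCall);

        b.setConnection(new NetworkConnection.Builder());
        System.out.println(b.lastCall);
    }
}
```

Existing user code that passes a literal null would stop compiling once the second overload appears, which is the backward-compatibility concern raised above.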
[jira] [Updated] (AVRO-1633) Add additional setXxx(Builder) method to make user code more readable.
[ https://issues.apache.org/jira/browse/AVRO-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1633: --- Resolution: Won't Fix Status: Resolved (was: Patch Available) Add additional setXxx(Builder) method to make user code more readable. -- Key: AVRO-1633 URL: https://issues.apache.org/jira/browse/AVRO-1633 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.0 Attachments: AVRO-1633-2015-01-20-v1.patch The currently generated code contains these two methods for the Builder instances (code sample was simplified): {code}public Request.Builder setConnection(NetworkConnection value) public Request.Builder setConnectionBuilder(NetworkConnection.Builder value){code} My proposal: Add in addition the method: {code}public Request.Builder setConnection(NetworkConnection.Builder value){code} Advantage: * You can do {{.setConnection(something)}} and pass either a {{NetworkConnection}} or a {{NetworkConnection.Builder}}. * User code becomes a bit more readable. Disadvantages: * Explicitly setting a {{null}} will trigger a Multiple implementations error and as such will need an explicit typecast. ** _This may be considered breaking backward compatibility!_ To solve this you must do either this: {code}.setConnection((NetworkConnection)null){code} or this {code} NetworkConnection nc = null; ... .setConnection(nc){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189473#comment-15189473 ] Niels Basjes commented on AVRO-1704: Note that having the "AVRO" prefix will also limit the number of needless calls to the Schema registry when bad records are put into the stream (like the Timer ticks example). > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
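The point about the prefix saving registry calls can be illustrated with a small guard. This is a sketch under assumptions: PrefixGuard and CountingStore are hypothetical names, the store is an in-memory stand-in for a remote schema registry, and how the fingerprint is extracted from the message is left out on purpose.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class PrefixGuard {
    // The 4-byte marker suggested in the comment above.
    static final byte[] MAGIC = "AVRO".getBytes(StandardCharsets.US_ASCII);

    // Hypothetical stand-in for a remote schema registry; it counts lookups.
    static final class CountingStore {
        final Map<Long, String> schemas = new HashMap<>();
        int lookups = 0;

        String find(long fingerprint) {
            lookups++;
            return schemas.get(fingerprint);
        }
    }

    static boolean hasMagic(byte[] message) {
        if (message == null || message.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (message[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the schema text, or null when the message is rejected or the schema is unknown.
    // The fingerprint is passed in separately; parsing it is outside this sketch.
    static String resolve(byte[] message, long fingerprint, CountingStore store) {
        if (!hasMagic(message)) {
            return null; // rejected locally, the store is never contacted
        }
        return store.find(fingerprint);
    }
}
```

A stray record such as a timer tick fails the hasMagic check, so the lookup counter stays at zero for it.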
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189402#comment-15189402 ] Niels Basjes commented on AVRO-1704: I've been looking into what kind of solution would work here since I'm working on a project where we need data structures going into Kafka and to be available to multiple consumers. The fundamental problem we need to solve is that of "Schema Evolution" in a streaming environment (Let's assume Kafka with the built in persistence of records). We need three things to make this happen: # A way to recognize a 'blob' is a serialized AVRO record. #* We can simply assume it is always an AVRO record. #* I think we should simply let such a record start with "AVRO" to ensure we can cleanly catch problems like this STORM-512 (Summary: Timer ticks were written into Kafka which caused a lot of deserialization errors in reading the AVRO records.) # A way to determine the schema this was written with. #* As indicated above I vote for using the CRC-64-AVRO. #** I noticed that a simple typo fix in the documentation of a Schema causes a new fingerprint to be generated. #** Proposal: I think we should 'clean' the schema before calculating the fingerprint. I.e. remove the things that do not impact the binary form of the record (like the doc field). # Have a place where we can find the schemas using the fingerprint as the key. #* Here I think (looking at AVRO-1124 and the fact that there are ready to run implementations like this [Schema Registry|http://docs.confluent.io/current/schema-registry/docs/index.html]) we should limit what we keep inside Avro to something like a "SchemaFactory" interface (as the storage/retrieval interface to get a Schema) and a very basic implementation that simply reads the available schemas from a (set of) property file(s). Using this others can write additional implementations that can read/write to things like databases or the above mentioned Schema Registry. 
So to summarize my proposal on the standard for the {{Single record serialization format}} can be written as: {code}"AVRO"{code} [~rdblue], I'm seeking feedback from you guys on this proposal. > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
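To make the fingerprint sensitivity concrete, here is a small sketch of the 64-bit CRC-64-AVRO computation, following the table-driven algorithm given in the Avro specification. The class name Crc64Avro and the two sample schema texts are made up for illustration, and the sketch fingerprints the raw schema text rather than any cleaned form.

```java
import java.nio.charset.StandardCharsets;

final class Crc64Avro {
    // Seed value from the Avro specification (the fingerprint of the empty input).
    static final long EMPTY = 0xc15d213aa4d7a795L;

    private static final long[] TABLE = new long[256];

    static {
        // Precompute the per-byte lookup table.
        for (int i = 0; i < 256; i++) {
            long fp = i;
            for (int j = 0; j < 8; j++) {
                fp = (fp >>> 1) ^ (EMPTY & -(fp & 1L));
            }
            TABLE[i] = fp;
        }
    }

    static long fingerprint64(byte[] buf) {
        long fp = EMPTY;
        for (byte b : buf) {
            fp = (fp >>> 8) ^ TABLE[(int) (fp ^ b) & 0xff];
        }
        return fp;
    }

    static long fingerprint64(String schemaText) {
        return fingerprint64(schemaText.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Two made-up schemas that differ only in a typo inside the doc attribute.
        String withTypo = "{\"type\":\"record\",\"name\":\"User\",\"doc\":\"Teh user\",\"fields\":[]}";
        String fixed    = "{\"type\":\"record\",\"name\":\"User\",\"doc\":\"The user\",\"fields\":[]}";
        System.out.println(Long.toHexString(fingerprint64(withTypo)));
        System.out.println(Long.toHexString(fingerprint64(fixed)));
    }
}
```

The two printed values differ even though both schemas produce identical binary records, which is the motivation for fingerprinting a normalized schema text with attributes such as doc removed.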
[jira] [Comment Edited] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190866#comment-15190866 ] Niels Basjes edited comment on AVRO-1704 at 3/11/16 1:00 PM: - Thanks for pointing this out. My updated proposal for this: {code}Avro{code} Where # "version" = 1 byte indicating the version (or "schema") of the rest of the bytes. if version == 0x00 # "Fingerprint" = the CRC-64-AVRO of the Canonical form of the Schema. # "Record" = the record serialized to byte using the existing serialization system. I personally do not like these 'chopped' prefixes if there is no "really good reason to chop them" (like the length). Because the projects name is so short: In this proposal I'm sticking to using the full name of the project as the prefix: "Avro" (i.e. these 4 bytes 0x41, 0x76, 0x72, 0x6F) was (Author: nielsbasjes): Thanks for pointing this out. My updated proposal for this: {code}"Avro"{code} Where # "version" = 1 byte indicating the version (or "schema") of the rest of the bytes. if version == 0x00 # "Fingerprint" = the CRC-64-AVRO of the Canonical form of the Schema. # "Record" = the record serialized to byte using the existing serialization system. I personally do not like these 'chopped' prefixes if there is no "really good reason to chop them" (like the length). Because the projects name is so short: In this proposal I'm sticking to using the full name of the project as the prefix: "Avro" (i.e. these 4 bytes 0x41, 0x76, 0x72, 0x6F) > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. 
The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15190866#comment-15190866 ] Niels Basjes commented on AVRO-1704: Thanks for pointing this out. My updated proposal for this: {code}"Avro"{code} Where # "version" = 1 byte indicating the version (or "schema") of the rest of the bytes. if version == 0x00 # "Fingerprint" = the CRC-64-AVRO of the Canonical form of the Schema. # "Record" = the record serialized to bytes using the existing serialization system. I personally do not like these 'chopped' prefixes if there is no "really good reason to chop them" (like the length). Because the project's name is so short: In this proposal I'm sticking to using the full name of the project as the prefix: "Avro" (i.e. these 4 bytes 0x41, 0x76, 0x72, 0x6F) > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. 
> The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
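The layout described in the comment above (the 4 bytes of "Avro", a version byte, the fingerprint, then the record) could be framed roughly as follows. This is a sketch under assumptions: the class name AvroFrame is hypothetical, and the fingerprint is written as 8 big-endian bytes purely for illustration, since the comment does not state its byte layout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AvroFrame {
    // "Avro" = 0x41, 0x76, 0x72, 0x6F
    static final byte[] PREFIX = "Avro".getBytes(StandardCharsets.US_ASCII);
    static final byte VERSION_0 = 0x00;
    private static final int HEADER_LENGTH = PREFIX.length + 1 + Long.BYTES;

    // Prepends prefix, version and fingerprint to an already serialized record.
    static byte[] wrap(long fingerprint, byte[] record) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH + record.length);
        buf.put(PREFIX).put(VERSION_0).putLong(fingerprint).put(record);
        return buf.array();
    }

    static long fingerprintOf(byte[] framed) {
        checkHeader(framed);
        return ByteBuffer.wrap(framed).getLong(PREFIX.length + 1);
    }

    static byte[] recordOf(byte[] framed) {
        checkHeader(framed);
        return Arrays.copyOfRange(framed, HEADER_LENGTH, framed.length);
    }

    // Rejects anything that does not start with the prefix and a known version.
    private static void checkHeader(byte[] framed) {
        if (framed.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("message too short");
        }
        for (int i = 0; i < PREFIX.length; i++) {
            if (framed[i] != PREFIX[i]) {
                throw new IllegalArgumentException("missing Avro prefix");
            }
        }
        if (framed[PREFIX.length] != VERSION_0) {
            throw new IllegalArgumentException("unsupported version");
        }
    }
}
```

A reader would call fingerprintOf to find the writer's schema and then hand recordOf to the regular binary decoder; any other version byte is rejected here because only version 0x00 is described.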
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238996#comment-15238996 ] Niels Basjes commented on AVRO-1704: I have a first addition: Think about supporting encryption. > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Attachments: AVRO-1704-20160410.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1826) build.sh rat fails over extra license files and many others.
[ https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238698#comment-15238698 ] Niels Basjes commented on AVRO-1826: Yes, that is an unintended mistake. I'll fix it and commit this weekend. Thanks > build.sh rat fails over extra license files and many others. > > > Key: AVRO-1826 > URL: https://issues.apache.org/jira/browse/AVRO-1826 > Project: Avro > Issue Type: Bug >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1826-20160410.patch > > > When running ./build.sh rat this will fail due to several license related > files we recently added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes reassigned AVRO-1704: -- Assignee: Niels Basjes > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1825) Allow running build.sh dist under git
[ https://issues.apache.org/jira/browse/AVRO-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1825: --- Status: Patch Available (was: Open) > Allow running build.sh dist under git > - > > Key: AVRO-1825 > URL: https://issues.apache.org/jira/browse/AVRO-1825 > Project: Avro > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1825-20160409.patch > > > When working from a git clone instead of an svn checkout the build.sh dist > cannot run due to an explicit dependency on the fact that the working > directory must be an svn checkout. > This should be a bit more flexible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1825) Allow running build.sh dist under git
[ https://issues.apache.org/jira/browse/AVRO-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1825: --- Attachment: AVRO-1825-20160409.patch The patch > Allow running build.sh dist under git > - > > Key: AVRO-1825 > URL: https://issues.apache.org/jira/browse/AVRO-1825 > Project: Avro > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1825-20160409.patch > > > When working from a git clone instead of an svn checkout the build.sh dist > cannot run due to an explicit dependency on the fact that the working > directory must be an svn checkout. > This should be a bit more flexible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AVRO-1825) Allow running build.sh dist under git
Niels Basjes created AVRO-1825: -- Summary: Allow running build.sh dist under git Key: AVRO-1825 URL: https://issues.apache.org/jira/browse/AVRO-1825 Project: Avro Issue Type: Improvement Components: build Reporter: Niels Basjes Assignee: Niels Basjes When working from a git clone instead of an svn checkout the build.sh dist cannot run due to an explicit dependency on the fact that the working directory must be an svn checkout. This should be a bit more flexible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AVRO-1828) Add EditorConfig file
Niels Basjes created AVRO-1828: -- Summary: Add EditorConfig file Key: AVRO-1828 URL: https://issues.apache.org/jira/browse/AVRO-1828 Project: Avro Issue Type: Improvement Reporter: Niels Basjes I was working with Apache Flink last week and they recently implemented http://editorconfig.org/ ( see here https://github.com/apache/flink/blob/master/.editorconfig ) Essentially this is a very simple config file that instructs a great many editors to adhere to the main coding standard choices (things like character encoding, tabs vs. spaces, newlines, etc.) for a specific project on a per file type basis. When someone opens the project in IntelliJ then this will automatically use these settings. Proposal: # We implement this for Avro at the root level with global defaults. # We implement a specific file per language. I think we should start with the top level scripting (like build.sh and pom.xml) and Java as the first language. # We fix the violations of this standard in a single commit per language. Note that if we don't fix those violations then later commits will be 'harder' to keep clean (you will see a lot of unrelated changes) because the IDEs will 'enforce' the standard on all touched files. What do you guys think? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1826) build.sh rat fails over extra licence files.
[ https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233994#comment-15233994 ] Niels Basjes commented on AVRO-1826: It also fails over files generated during the build test and dist steps. Even after a build clean many of these remain. > build.sh rat fails over extra licence files. > > > Key: AVRO-1826 > URL: https://issues.apache.org/jira/browse/AVRO-1826 > Project: Avro > Issue Type: Bug >Reporter: Niels Basjes >Assignee: Niels Basjes > > When running ./build.sh rat this will fail due to several license related > files we recently added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1826) build.sh rat fails over extra license files and many others.
[ https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1826: --- Status: Patch Available (was: Open) > build.sh rat fails over extra license files and many others. > > > Key: AVRO-1826 > URL: https://issues.apache.org/jira/browse/AVRO-1826 > Project: Avro > Issue Type: Bug >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1826-20160410.patch > > > When running ./build.sh rat this will fail due to several license related > files we recently added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1704: --- Attachment: AVRO-1704-20160410.patch > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Attachments: AVRO-1704-20160410.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'
[ https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1814: --- Attachment: AVRO-1814-20160410.patch Turns out that this problem in essence is a limitation of the way Java does resolution when there are name clashes between packages, classes, etc. This patch at least mitigates the probability of this occurring in user applications. > 1.8 IDL generator broken when containing a field called 'org' > - > > Key: AVRO-1814 > URL: https://issues.apache.org/jira/browse/AVRO-1814 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza >Assignee: Niels Basjes > Attachments: AVRO-1814-20160410.patch > > > The problem is in the generated 'readExternal' and 'writeExternal' functions, > because they do something like: > WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out)); > When a member variable called 'org' exists, then the compile fails because > the compiler thinks that 'org' is a member variable and that 'apache cannot > be resolved or is not a field'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1826) build.sh rat fails over extra license files and many others.
[ https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1826: --- Attachment: AVRO-1826-20160410.patch After running both ./build.sh test and dist, this will now let the build rat pass. > build.sh rat fails over extra license files and many others. > > > Key: AVRO-1826 > URL: https://issues.apache.org/jira/browse/AVRO-1826 > Project: Avro > Issue Type: Bug >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1826-20160410.patch > > > When running ./build.sh rat this will fail due to several license related > files we recently added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234304#comment-15234304 ] Niels Basjes edited comment on AVRO-1704 at 4/10/16 10:10 PM: -- During the last few weeks I spent some time figuring out what I think the format should be. I created this patch which includes specification for the new format, code generators for Java and unit tests that validate the format in light of schema evolution and corrupt data. I documented the new format as follows: {quote} Schema tagged Binary Encoding specification The wrapper format consists of a header and a body. The header is always the 4 bytes representing the UTF-8 of the word "Avro" followed by a single byte indicating the version of the body format. Version 0 of the body (currently the ONLY body format that has been defined) consists of: # the finger print (see the section about Schema Fingerprints of the schema (a 64 bit long) that was written in the same byte order as a long is when written if it was a field in a record. # the record serialized to byte using the binary encoding. {quote} Although I think this is already "pretty good" I really think this needs your comments and improvement suggestions. Thanks. was (Author: nielsbasjes): During the last few weeks I spent some time figuring out what I think the format should be. I created this patch which includes specification for the new format, code generators for Java and unit tests that validate the format in light of schema evolution and corrupt data. I documented the new format as follows: {quote} Schema tagged Binary Encoding specification The wrapper format consists of a header and a body. The header is always the 4 bytes representing the UTF-8 of the word "Avro" followed by a single byte indicating the version of the body format. 
Version 0 of the body (currently the ONLY body format that has been defined) consists of: # the finger print (see the section about Schema Fingerprints of the schema (a 64 bit long) that was written in the same byte order as a long is when written if it was a field in a record. # the record serialized to byte using the binary encoding. {quote} Although I thing this is already "pretty good" I really think this needs your comments and improvement suggestions. Thanks. > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Attachments: AVRO-1704-20160410.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. 
> The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
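To make the proposed wrapper layout concrete, here is a minimal sketch of framing and unframing a message. It is not the code from the attached patch. The class and method names are hypothetical, and the fingerprint is written as a fixed 8 byte little-endian value, which is an assumption: the quoted text could also be read as meaning Avro's variable-length long encoding.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Avro: frames an already-encoded record
// as "Avro" + version byte + 64 bit schema fingerprint + payload.
class SchemaTaggedFrame {
    static final byte[] MAGIC = "Avro".getBytes(StandardCharsets.UTF_8);
    static final byte VERSION_0 = 0;
    static final int HEADER_LENGTH = MAGIC.length + 1 + Long.BYTES;

    // ASSUMPTION: fixed-width little-endian fingerprint (see lead-in).
    static byte[] wrap(long fingerprint, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH + payload.length)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.put(MAGIC).put(VERSION_0).putLong(fingerprint).put(payload);
        return buf.array();
    }

    // Reads the fingerprint back so a reader can look up the writer's schema.
    static long fingerprintOf(byte[] message) {
        checkHeader(message);
        return ByteBuffer.wrap(message, MAGIC.length + 1, Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getLong();
    }

    // Returns the binary-encoded record that follows the header.
    static byte[] payloadOf(byte[] message) {
        checkHeader(message);
        return Arrays.copyOfRange(message, HEADER_LENGTH, message.length);
    }

    // Rejects input that is too short, lacks the magic bytes, or has an unknown version.
    private static void checkHeader(byte[] message) {
        if (message == null || message.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("Message too short for header");
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (message[i] != MAGIC[i]) {
                throw new IllegalArgumentException("Missing 'Avro' marker");
            }
        }
        if (message[MAGIC.length] != VERSION_0) {
            throw new IllegalArgumentException("Unsupported body version");
        }
    }
}
```

With this layout the header costs 13 bytes per message, regardless of how large the schema is.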
[jira] [Updated] (AVRO-1826) build.sh rat fails over extra license files and many others.
[ https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1826: --- Summary: build.sh rat fails over extra license files and many others. (was: build.sh rat fails over extra licence files.) > build.sh rat fails over extra license files and many others. > > > Key: AVRO-1826 > URL: https://issues.apache.org/jira/browse/AVRO-1826 > Project: Avro > Issue Type: Bug >Reporter: Niels Basjes >Assignee: Niels Basjes > > When running ./build.sh rat this will fail due to several license related > files we recently added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1704: --- Status: Patch Available (was: Open) During the last few weeks I spent some time figuring out what I think the format should be. I created this patch which includes a specification for the new format, code generators for Java and unit tests that validate the format in light of schema evolution and corrupt data. I documented the new format as follows: {quote} Schema tagged Binary Encoding specification The wrapper format consists of a header and a body. The header is always the 4 bytes representing the UTF-8 encoding of the word "Avro" followed by a single byte indicating the version of the body format. Version 0 of the body (currently the ONLY body format that has been defined) consists of: # the fingerprint of the schema (see the section about Schema Fingerprints), a 64 bit long written in the same byte order as a long would be if it were a field in a record. # the record serialized to bytes using the binary encoding. {quote} Although I think this is already "pretty good" I really think this needs your comments and improvement suggestions. Thanks. > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Attachments: AVRO-1704-20160410.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. 
> However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'
[ https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1814: --- Status: Patch Available (was: Open) > 1.8 IDL generator broken when containing a field called 'org' > - > > Key: AVRO-1814 > URL: https://issues.apache.org/jira/browse/AVRO-1814 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza >Assignee: Niels Basjes > Attachments: AVRO-1814-20160410.patch > > > The problem is in the generated 'readExternal' and 'writeExternal' functions, > because they do something like: > WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out)); > When a member variable called 'org' exists, then the compile fails because > the compiler thinks that 'org' is a member variable and that 'apache cannot > be resolved or is not a field'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AVRO-1826) build.sh rat fails over extra licence files.
Niels Basjes created AVRO-1826: -- Summary: build.sh rat fails over extra licence files. Key: AVRO-1826 URL: https://issues.apache.org/jira/browse/AVRO-1826 Project: Avro Issue Type: Bug Reporter: Niels Basjes Assignee: Niels Basjes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1826) build.sh rat fails over extra licence files.
[ https://issues.apache.org/jira/browse/AVRO-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1826: --- Description: When running ./build.sh rat this will fail due to several license related files we recently added. > build.sh rat fails over extra licence files. > > > Key: AVRO-1826 > URL: https://issues.apache.org/jira/browse/AVRO-1826 > Project: Avro > Issue Type: Bug >Reporter: Niels Basjes >Assignee: Niels Basjes > > When running ./build.sh rat this will fail due to several license related > files we recently added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1825) Allow running build.sh dist under git
[ https://issues.apache.org/jira/browse/AVRO-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236098#comment-15236098 ] Niels Basjes commented on AVRO-1825: I'll commit this when I get back from the Hadoop Summit (Ireland). I just need to read up on the exact procedure. > Allow running build.sh dist under git > - > > Key: AVRO-1825 > URL: https://issues.apache.org/jira/browse/AVRO-1825 > Project: Avro > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1825-20160409.patch > > > When working off a git clone instead of an svn checkout, build.sh dist > cannot run because it explicitly requires the working > directory to be an svn checkout. > This should be a bit more flexible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210535#comment-15210535 ] Niels Basjes commented on AVRO-1704: I did some experimenting over the last week and I posted my changed version of Avro here: https://github.com/nielsbasjes/avro/tree/AVRO-1704 What I did so far: # Added to Schema the getFingerPrint() method that uses the CRC-64-AVRO to calculate the schema finger print. # Added a few SchemaStorage related classes that allow storing schemas in memory. # Added to the generated classes the toBytes() method and the fromBytes static method. Both effectively call the 'real' implementations which are in the SpecificRecordBase class. All of this passes all of the Java unit testing. At the application end my test code (using 3 slightly different variations of the same schema) looks like this. This works exactly as I expect it to. {code:java} SchemaFactory.put(com.bol.measure.v1.Measurement.getClassSchema()); SchemaFactory.put(com.bol.measure.v2.Measurement.getClassSchema()); SchemaFactory.put(com.bol.measure.v3.Measurement.getClassSchema()); com.bol.measure.v1.Measurement measurement = DummyMeasurementFactory.createTestMeasurement(timestamp); byte[] bytesV1 = measurement.toBytes(); com.bol.measure.v2.Measurement newBornV2 = com.bol.measure.v2.Measurement.fromBytes(bytesV1); com.bol.measure.v3.Measurement newBornV3 = com.bol.measure.v3.Measurement.fromBytes(bytesV1); {code} Things currently missing: Documentation, extra tests, etc. I could really use some feedback on the structure of my change and advice on how to approach the need to call a 'close()' method on the schema storage part. Thanks. 
> Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
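For readers unfamiliar with CRC-64-AVRO, the following is a minimal sketch of the 64 bit fingerprint computation, following the algorithm as published in the Schema Fingerprints section of the Avro specification. The class name is hypothetical and this is not the getFingerPrint() method from the branch mentioned above. Note that a real schema fingerprint is computed over the schema's Parsing Canonical Form; the sketch simply accepts arbitrary bytes.

```java
// Hypothetical stand-alone sketch of the CRC-64-AVRO fingerprint.
class Crc64AvroSketch {
    // Seed value, which is also the fingerprint of empty input.
    static final long EMPTY = 0xc15d213aa4d7a795L;

    // Lookup table with one entry per possible byte value.
    private static final long[] TABLE = new long[256];

    static {
        for (int i = 0; i < 256; i++) {
            long fp = i;
            for (int j = 0; j < 8; j++) {
                // Shift right by one bit; XOR in EMPTY when the dropped bit was 1.
                fp = (fp >>> 1) ^ (EMPTY & -(fp & 1L));
            }
            TABLE[i] = fp;
        }
    }

    // Folds each input byte into the running 64 bit fingerprint.
    static long fingerprint64(byte[] data) {
        long fp = EMPTY;
        for (byte b : data) {
            fp = (fp >>> 8) ^ TABLE[(int) (fp ^ b) & 0xff];
        }
        return fp;
    }
}
```

In the Java library itself, org.apache.avro.SchemaNormalization.parsingFingerprint64(Schema) provides this calculation for a parsed schema, so application code would normally not need its own copy.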
[jira] [Commented] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'
[ https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211839#comment-15211839 ] Niels Basjes commented on AVRO-1814: I did a quick test and this is bigger than just this one. It also applies to the TLD of the company actually writing the software. {code}@namespace("nl.basjes.test") protocol Hacking { record Hack { string nl; string org; string com; } }{code} gives similar errors about situations like this: {code}super(nl.basjes.test.Hack.SCHEMA$);{code} > 1.8 IDL generator broken when containing a field called 'org' > - > > Key: AVRO-1814 > URL: https://issues.apache.org/jira/browse/AVRO-1814 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza > > The problem is in the generated 'readExternal' and 'writeExternal' functions, > because they do something like: > WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out)); > When a member variable called 'org' exists, then the compile fails because > the compiler thinks that 'org' is a member variable and that 'apache cannot > be resolved or is not a field'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
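The underlying Java rule can be reproduced without Avro. The sketch below is a hypothetical stand-in for a generated class (not actual generator output, and not the fix from the patch): a field named like a top-level package takes precedence over that package wherever an expression is expected. The field is called 'java' here purely so the example needs nothing beyond the JDK.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a generated record class; not actual Avro output.
class HackRecord {
    // A field named like a top-level package, analogous to 'org', 'nl' or 'com'.
    String java = "just a field";

    List<String> tags() {
        // The next line would NOT compile inside this class: in an expression
        // the simple name 'java' resolves to the String field above, so
        // 'java.util' is looked up as a member of String.
        // return java.util.Collections.emptyList();

        // Referring to the class by its imported simple name avoids the
        // ambiguous package prefix altogether.
        return Collections.emptyList();
    }
}
```

The import statements are unaffected because they are resolved outside the scope of the class's fields.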
[jira] [Assigned] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'
[ https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes reassigned AVRO-1814: -- Assignee: Niels Basjes > 1.8 IDL generator broken when containing a field called 'org' > - > > Key: AVRO-1814 > URL: https://issues.apache.org/jira/browse/AVRO-1814 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza >Assignee: Niels Basjes > > The problem is in the generated 'readExternal' and 'writeExternal' functions, > because they do something like: > WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out)); > When a member variable called 'org' exists, then the compile fails because > the compiler thinks that 'org' is a member variable and that 'apache cannot > be resolved or is not a field'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'
[ https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15257020#comment-15257020 ] Niels Basjes commented on AVRO-1814: I did that because I wanted an explicit test case to verify the specifics of the TLD of the namespace in addition to the main TLD 'org'. I guess it is fine to simply leave that part as-is (i.e. keep it org.apache. ). > 1.8 IDL generator broken when containing a field called 'org' > - > > Key: AVRO-1814 > URL: https://issues.apache.org/jira/browse/AVRO-1814 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza >Assignee: Niels Basjes > Attachments: AVRO-1814-20160410.patch > > > The problem is in the generated 'readExternal' and 'writeExternal' functions, > because they do something like: > WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out)); > When a member variable called 'org' exists, then the compile fails because > the compiler thinks that 'org' is a member variable and that 'apache cannot > be resolved or is not a field'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1834) Lower the Javadoc warnings on the generated code.
[ https://issues.apache.org/jira/browse/AVRO-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1834: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed > Lower the Javadoc warnings on the generated code. > - > > Key: AVRO-1834 > URL: https://issues.apache.org/jira/browse/AVRO-1834 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1834-2016-04-25.patch > > > I see a LOT of JavaDoc related warnings on the generated code in Java. > They are all about things like {{warning: no @param for}} and {{missing: > @return}}. > In my work project this results in hundreds of warnings so they obfuscate the > things that do need attention. > As these are generated I expect the required changes to be minimal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254092#comment-15254092 ] Niels Basjes commented on AVRO-1704: Question: What would be the preferred way of handling error situations like * Unknown schema fingerprint * Bad set of bytes (in various forms) I see at least two general directions: # Return null # Throw an error What is preferred in this case? Which is 'better' for the application developers? > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Attachments: AVRO-1704-20160410.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. 
> The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
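To illustrate the two directions raised in the question above, here is a small sketch of what each would look like for the "unknown schema fingerprint" case. All names are hypothetical and the schema is represented as a plain JSON string to keep the example self-contained. It takes no position on which option the project should choose.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception type for the "throw an error" direction.
class UnknownFingerprintException extends RuntimeException {
    UnknownFingerprintException(long fingerprint) {
        super("No schema registered for fingerprint " + Long.toHexString(fingerprint));
    }
}

// Hypothetical in-memory lookup offering both error-handling styles.
class SchemaLookupSketch {
    private final Map<Long, String> schemasByFingerprint = new HashMap<>();

    void register(long fingerprint, String schemaJson) {
        schemasByFingerprint.put(fingerprint, schemaJson);
    }

    // Direction 1: return null; every caller has to remember the null check.
    String findOrNull(long fingerprint) {
        return schemasByFingerprint.get(fingerprint);
    }

    // Direction 2: throw; the failure carries the offending fingerprint.
    String findOrThrow(long fingerprint) {
        String schemaJson = schemasByFingerprint.get(fingerprint);
        if (schemaJson == null) {
            throw new UnknownFingerprintException(fingerprint);
        }
        return schemaJson;
    }
}
```

The practical difference shows up at the call site: with the null variant a forgotten check surfaces later as a NullPointerException far from the cause, while the throwing variant fails at the lookup and names the fingerprint.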
[jira] [Updated] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize
[ https://issues.apache.org/jira/browse/AVRO-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1835: --- Attachment: AVRO-1835-2016-04-27.patch [~rdblue] Good idea to use a profile. The attached patch works for me on my normal system (uses 1.8) and within our docker image (uses 1.7). I verified by putting in some wrong values (causing lots of errors) that indeed in both cases the correct argLine value is selected when running tests. Please verify. > Running tests using JDK 1.8 complains about MaxPermSize > --- > > Key: AVRO-1835 > URL: https://issues.apache.org/jira/browse/AVRO-1835 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1835-2016-04-25.patch, AVRO-1835-2016-04-27.patch > > > When building AVRO under JDK 1.8 (as I assume most of us do) the output > contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option > MaxPermSize=200m; support was removed in 8.0{code}for every test class that > is run. 
> The output then becomes cluttered like this: > {code} > --- > T E S T S > --- > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestEncoders > Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - > in org.apache.avro.io.TestEncoders > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestBlockingIO2 > Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - > in org.apache.avro.io.TestBlockingIO2 > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestBlockingIO > Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - > in org.apache.avro.io.TestBlockingIO > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator > Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - > in org.apache.avro.io.parsing.TestResolvingGrammarGenerator > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2 > Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - > in org.apache.avro.io.parsing.TestResolvingGrammarGenerator2 > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestResolvingIOResolving > Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - > in org.apache.avro.io.TestResolvingIOResolving > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1834) Lower the Javadoc warnings on the generated code.
[ https://issues.apache.org/jira/browse/AVRO-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1834: --- Description: I see a LOT of JavaDoc related warnings on the generated code in Java. They are all about things like {{warning: no @param for}} and {{missing: @return}}. In my work project this results in hundreds of warnings so they obfuscate the things that do need attention. As these are generated I expect the required changes to be minimal. was:I see a LOT of JavaDoc related warnings on the generated code in Java > Lower the Javadoc warnings on the generated code. > - > > Key: AVRO-1834 > URL: https://issues.apache.org/jira/browse/AVRO-1834 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > > I see a LOT of JavaDoc related warnings on the generated code in Java. > They are all about things like {{warning: no @param for}} and {{missing: > @return}}. > In my work project this results in hundreds of warnings so they obfuscate the > things that do need attention. > As these are generated I expect the required changes to be minimal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (AVRO-1834) Lower the Javadoc warnings on the generated code.
Niels Basjes created AVRO-1834: -- Summary: Lower the Javadoc warnings on the generated code. Key: AVRO-1834 URL: https://issues.apache.org/jira/browse/AVRO-1834 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.1 I see a LOT of JavaDoc related warnings on the generated code in Java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1834) Lower the Javadoc warnings on the generated code.
[ https://issues.apache.org/jira/browse/AVRO-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1834: --- Attachment: AVRO-1834-2016-04-25.patch This patch adds only 3 extra lines in the record.vm (and as a consequence changes the generated Player.java files). This change drops the number of Javadoc warnings in my own project from >100 to 0. > Lower the Javadoc warnings on the generated code. > - > > Key: AVRO-1834 > URL: https://issues.apache.org/jira/browse/AVRO-1834 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1834-2016-04-25.patch > > > I see a LOT of JavaDoc related warnings on the generated code in Java. > They are all about things like {{warning: no @param for}} and {{missing: > @return}}. > In my work project this results in hundreds of warnings so they obfuscate the > things that do need attention. > As these are generated I expect the required changes to be minimal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
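As an illustration of the kind of tags these warnings ask for, here is a hypothetical accessor pair written in the style of a generated record class. It is not the actual output of record.vm and does not reproduce the three lines added by the patch. It only shows a getter and setter that carry the @return and @param tags named in the warnings.

```java
// Hypothetical excerpt in the style of a generated record class.
class PlayerSketch {
    private int number;

    /**
     * Gets the value of the 'number' field.
     * @return The value of the 'number' field.
     */
    public int getNumber() {
        return number;
    }

    /**
     * Sets the value of the 'number' field.
     * @param value the value to set.
     */
    public void setNumber(int value) {
        this.number = value;
    }
}
```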
[jira] [Commented] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'
[ https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256213#comment-15256213 ] Niels Basjes commented on AVRO-1814: [~rdblue] can you please have a quick look at this one? Do we keep the old behavior (i.e. simply tell people "don't do this" and put this as a "Won't fix") or do we reduce the impact of this by means of the change I put in? > 1.8 IDL generator broken when containing a field called 'org' > - > > Key: AVRO-1814 > URL: https://issues.apache.org/jira/browse/AVRO-1814 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza >Assignee: Niels Basjes > Attachments: AVRO-1814-20160410.patch > > > The problem is in the generated 'readExternal' and 'writeExternal' functions, > because they do something like: > WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out)); > When a member variable called 'org' exists, then the compile fails because > the compiler thinks that 'org' is a member variable and that 'apache cannot > be resolved or is not a field'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1834) Lower the Javadoc warnings on the generated code.
[ https://issues.apache.org/jira/browse/AVRO-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1834: --- Status: Patch Available (was: Open) > Lower the Javadoc warnings on the generated code. > - > > Key: AVRO-1834 > URL: https://issues.apache.org/jira/browse/AVRO-1834 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1834-2016-04-25.patch > > > I see a LOT of JavaDoc related warnings on the generated code in Java. > They are all about things like {{warning: no @param for}} and {{missing: > @return}}. > In my work project this results in hundreds of warnings so they obfuscate the > things that do need attention. > As these are generated I expect the required changes to be minimal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize
[ https://issues.apache.org/jira/browse/AVRO-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1835: --- Status: Patch Available (was: Open) > Running tests using JDK 1.8 complains about MaxPermSize > --- > > Key: AVRO-1835 > URL: https://issues.apache.org/jira/browse/AVRO-1835 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1835-2016-04-25.patch > > > When building AVRO under JDK 1.8 (as I assume most of us do) the output > contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option > MaxPermSize=200m; support was removed in 8.0{code}for every test class that > is run. > The output then becomes cluttered like this: > {code} > --- > T E S T S > --- > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestEncoders > Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - > in org.apache.avro.io.TestEncoders > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestBlockingIO2 > Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - > in org.apache.avro.io.TestBlockingIO2 > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestBlockingIO > Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - > in org.apache.avro.io.TestBlockingIO > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator > Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - > in org.apache.avro.io.parsing.TestResolvingGrammarGenerator > OpenJDK 64-Bit Server VM warning: ignoring option 
MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2 > Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - > in org.apache.avro.io.parsing.TestResolvingGrammarGenerator2 > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestResolvingIOResolving > Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - > in org.apache.avro.io.TestResolvingIOResolving > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize
[ https://issues.apache.org/jira/browse/AVRO-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1835: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed > Running tests using JDK 1.8 complains about MaxPermSize > --- > > Key: AVRO-1835 > URL: https://issues.apache.org/jira/browse/AVRO-1835 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1835-2016-04-25.patch, AVRO-1835-2016-04-27.patch > > > When building AVRO under JDK 1.8 (as I assume most of us do) the output > contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option > MaxPermSize=200m; support was removed in 8.0{code}for every test class that > is run. > The output then becomes cluttered like this: > {code} > --- > T E S T S > --- > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestEncoders > Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - > in org.apache.avro.io.TestEncoders > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestBlockingIO2 > Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - > in org.apache.avro.io.TestBlockingIO2 > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestBlockingIO > Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - > in org.apache.avro.io.TestBlockingIO > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator > Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - > in org.apache.avro.io.parsing.TestResolvingGrammarGenerator > 
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2 > Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - > in org.apache.avro.io.parsing.TestResolvingGrammarGenerator2 > OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support > was removed in 8.0 > Running org.apache.avro.io.TestResolvingIOResolving > Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - > in org.apache.avro.io.TestResolvingIOResolving > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (AVRO-1828) Add EditorConfig file
[ https://issues.apache.org/jira/browse/AVRO-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes reassigned AVRO-1828: -- Assignee: Niels Basjes > Add EditorConfig file > - > > Key: AVRO-1828 > URL: https://issues.apache.org/jira/browse/AVRO-1828 > Project: Avro > Issue Type: Improvement >Reporter: Niels Basjes >Assignee: Niels Basjes > > I was working with Apache Flink last week and they recently implemented > http://editorconfig.org/ ( see here > https://github.com/apache/flink/blob/master/.editorconfig ) > Essentially this is a very simple config file that instructs a great many > editors to adhere to the main coding standard choices (things like character > encoding, tabs vs. spaces, newlines, etc.) for a specific project on a per > file type basis. > When someone opens the project in IntelliJ then this will automatically use > these settings. > Proposal: > # We implement this for Avro at the root level with global defaults. > # We implement a specific file per language. I think we should start with the > top level scripting (like build.sh and pom.xml) and Java as the first > language. > # We fix the violations of this standard in a single commit per language. > Note that if we don't fix those violations then later commits will be > 'harder' to keep clean (you will see a lot of unrelated changes) because the > IDEs will 'enforce' the standard on all touched files. > What do you guys think? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1814) Generated java code fails on variables with a TLD name like 'org'
[ https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1814: --- Resolution: Fixed Release Note: Using a variable name that also happens to be a toplevel domain name (like 'org') no longer causes errors. Status: Resolved (was: Patch Available) Committed > Generated java code fails on variables with a TLD name like 'org' > - > > Key: AVRO-1814 > URL: https://issues.apache.org/jira/browse/AVRO-1814 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza >Assignee: Niels Basjes > Attachments: AVRO-1814-20160410.patch, AVRO-1814-20160428.patch > > > The problem is in the generated 'readExternal' and 'writeExternal' functions, > because they do something like: > WRITER$.write(this, org.apache.avro.specific.SpecificData.getEncoder(out)); > When a member variable called 'org' exists, then the compile fails because > the compiler thinks that 'org' is a member variable and that 'apache cannot > be resolved or is not a field'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
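The name clash described in this ticket can be reproduced without Avro. The sketch below is only an illustration under assumptions: it uses a field called `java` and the JDK class `java.util.Objects` as stand-ins for a field called `org` and `org.apache.avro.specific.SpecificData`, and the class name `ShadowedPackageDemo` is made up. It is not the generated Avro code, and the ticket does not say whether the committed patch works this way.

```java
import java.util.Objects;

// Hypothetical demo class; not part of Avro.
class ShadowedPackageDemo {
    // A field whose name equals the first segment of a package name.
    private final String java = "field value";

    int fingerprint() {
        // Does not compile if uncommented: inside an expression the leading
        // identifier 'java' resolves to the field above, so the compiler
        // then looks for a member named 'util' on that String.
        // return java.util.Objects.hash(java);

        // Compiles: the imported simple name 'Objects' is resolved as a type
        // and is never reinterpreted as a field access.
        return Objects.hash(java);
    }

    public static void main(String[] args) {
        System.out.println(new ShadowedPackageDemo().fingerprint());
    }
}
```

Importing the class and calling it by its simple name is one way to sidestep the clash in hand-written code; a code generator has to pick a form that stays valid for any field name a schema may contain.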
[jira] [Updated] (AVRO-1814) 1.8 IDL generator broken when containing a field called 'org'
[ https://issues.apache.org/jira/browse/AVRO-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1814: --- Attachment: AVRO-1814-20160428.patch I put the original namespace back. This wasn't really needed to verify the problem. > 1.8 IDL generator broken when containing a field called 'org' > - > > Key: AVRO-1814 > URL: https://issues.apache.org/jira/browse/AVRO-1814 > Project: Avro > Issue Type: Bug > Components: java >Affects Versions: 1.8.0 >Reporter: Dustin Spicuzza >Assignee: Niels Basjes > Attachments: AVRO-1814-20160410.patch, AVRO-1814-20160428.patch
[jira] [Updated] (AVRO-1828) Add EditorConfig file
[ https://issues.apache.org/jira/browse/AVRO-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1828: --- Attachment: AVRO-1828-2016-04-28.patch This is the .editorconfig file for .sh, .xml and .java. The patch also includes these changes to the affected files: - Remove trailing spaces and tabs - Remove (leading) tabs For the files that had leading tabs I fixed the indentation (for example in the top-level build.sh and the build.sh scripts for several of the languages). I chose not to touch the leading tabs in the documentation files at this moment. This patch is mostly about spaces and tabs, so after applying it a command like {{git diff -w}} will yield almost no changes. I still need to run the full test set (all languages) on this one. I ran the Java tests and they passed. > Add EditorConfig file > - > > Key: AVRO-1828 > URL: https://issues.apache.org/jira/browse/AVRO-1828 > Project: Avro > Issue Type: Improvement >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1828-2016-04-28.patch
[jira] [Updated] (AVRO-1828) Add EditorConfig file
[ https://issues.apache.org/jira/browse/AVRO-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1828: --- Attachment: AVRO-1828-2016-04-28-ratfix.patch The new file failed {{./build.sh rat}}. > Add EditorConfig file > - > > Key: AVRO-1828 > URL: https://issues.apache.org/jira/browse/AVRO-1828 > Project: Avro > Issue Type: Improvement >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1828-2016-04-28-ratfix.patch, > AVRO-1828-2016-04-28.patch
[jira] [Updated] (AVRO-1828) Add EditorConfig file
[ https://issues.apache.org/jira/browse/AVRO-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1828: --- Affects Version/s: 1.8.0 Status: Patch Available (was: Open) Please review / comment > Add EditorConfig file > - > > Key: AVRO-1828 > URL: https://issues.apache.org/jira/browse/AVRO-1828 > Project: Avro > Issue Type: Improvement >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Attachments: AVRO-1828-2016-04-28-ratfix.patch, > AVRO-1828-2016-04-28.patch
[jira] [Updated] (AVRO-1704) Standardized format for encoding messages with Avro
[ https://issues.apache.org/jira/browse/AVRO-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1704: --- Status: Open (was: Patch Available) > Standardized format for encoding messages with Avro > --- > > Key: AVRO-1704 > URL: https://issues.apache.org/jira/browse/AVRO-1704 > Project: Avro > Issue Type: Improvement >Reporter: Daniel Schierbeck >Assignee: Niels Basjes > Attachments: AVRO-1704-20160410.patch > > > I'm currently using the Datafile format for encoding messages that are > written to Kafka and Cassandra. This seems rather wasteful: > 1. I only encode a single record at a time, so there's no need for sync > markers and other metadata related to multi-record files. > 2. The entire schema is inlined every time. > However, the Datafile format is the only one that has been standardized, > meaning that I can read and write data with minimal effort across the various > languages in use in my organization. If there was a standardized format for > encoding single values that was optimized for out-of-band schema transfer, I > would much rather use that. > I think the necessary pieces of the format would be: > 1. A format version number. > 2. A schema fingerprint type identifier, i.e. Rabin, MD5, SHA256, etc. > 3. The actual schema fingerprint (according to the type.) > 4. Optional metadata map. > 5. The encoded datum. > The language libraries would implement a MessageWriter that would encode > datums in this format, as well as a MessageReader that, given a SchemaStore, > would be able to decode datums. The reader would decode the fingerprint and > ask its SchemaStore to return the corresponding writer's schema. > The idea is that SchemaStore would be an abstract interface that allowed > library users to inject custom backends. A simple, file system based one > could be provided out of the box. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
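The framing proposed in this ticket can be sketched in a few lines. Everything named below is hypothetical and only illustrates the idea: the class `SingleMessageSketch`, the `SchemaStore` method `findByFingerprint`, the version and type constants, and the fixed 8-byte fingerprint width (which assumes a 64-bit Rabin fingerprint) are assumptions, not Avro's actual API or wire format. The optional metadata map from the list is left out to keep the sketch short, and the datum is treated as already-encoded bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the proposed single-message framing; not Avro's API.
class SingleMessageSketch {
    static final byte FORMAT_VERSION = 1;      // assumed value
    static final byte FINGERPRINT_RABIN64 = 0; // assumed type identifier

    // Hypothetical pluggable lookup from fingerprint to writer schema
    // (the schema is kept as JSON text here).
    interface SchemaStore {
        String findByFingerprint(long fingerprint);
    }

    // Frame layout: version (1 byte), fingerprint type (1 byte),
    // fingerprint (8 bytes), then the encoded datum.
    static byte[] encode(long fingerprint, byte[] encodedDatum) {
        ByteBuffer buf = ByteBuffer.allocate(2 + Long.BYTES + encodedDatum.length);
        buf.put(FORMAT_VERSION).put(FINGERPRINT_RABIN64).putLong(fingerprint).put(encodedDatum);
        return buf.array();
    }

    // Reads the header, checks that the store knows the writer schema,
    // and returns the remaining datum bytes for the caller to decode.
    static byte[] decode(byte[] message, SchemaStore store) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        byte version = buf.get();
        byte type = buf.get();
        if (version != FORMAT_VERSION || type != FINGERPRINT_RABIN64) {
            throw new IllegalArgumentException("Unsupported message header");
        }
        long fingerprint = buf.getLong();
        if (store.findByFingerprint(fingerprint) == null) {
            throw new IllegalStateException("Unknown schema fingerprint " + fingerprint);
        }
        byte[] datum = new byte[buf.remaining()];
        buf.get(datum);
        return datum;
    }
}
```

Because `SchemaStore` has a single method, a caller could back it with a map, a directory of schema files, or a remote registry without changing the reader, which is the kind of injection the description asks for.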
[jira] [Created] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize
Niels Basjes created AVRO-1835: -- Summary: Running tests using JDK 1.8 complains about MaxPermSize Key: AVRO-1835 URL: https://issues.apache.org/jira/browse/AVRO-1835 Project: Avro Issue Type: Improvement Components: java Affects Versions: 1.8.0 Reporter: Niels Basjes Assignee: Niels Basjes Fix For: 1.8.1 When building Avro under JDK 1.8 (as I assume most of us do) the output contains the line {code}OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was removed in 8.0{code} for every test class that is run. The output then becomes cluttered like this: {code} --- T E S T S --- OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was removed in 8.0 Running org.apache.avro.io.TestEncoders Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.467 sec - in org.apache.avro.io.TestEncoders OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was removed in 8.0 Running org.apache.avro.io.TestBlockingIO2 Tests run: 84, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.157 sec - in org.apache.avro.io.TestBlockingIO2 OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was removed in 8.0 Running org.apache.avro.io.TestBlockingIO Tests run: 376, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.347 sec - in org.apache.avro.io.TestBlockingIO OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was removed in 8.0 Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - in org.apache.avro.io.parsing.TestResolvingGrammarGenerator OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was removed in 8.0 Running org.apache.avro.io.parsing.TestResolvingGrammarGenerator2 Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.341 sec - in org.apache.avro.io.parsing.TestResolvingGrammarGenerator2 OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=200m; support was removed in 8.0 Running org.apache.avro.io.TestResolvingIOResolving Tests run: 192, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.575 sec - in org.apache.avro.io.TestResolvingIOResolving {code}
[jira] [Updated] (AVRO-1835) Running tests using JDK 1.8 complains about MaxPermSize
[ https://issues.apache.org/jira/browse/AVRO-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1835: --- Attachment: AVRO-1835-2016-04-25.patch As this ONLY affects running tests I suspect it is safe to assume we have Java 1.8 available. I'm not sure about the current status of systems like Jenkins. [~rdblue] perhaps we should wait with this one until we have completed AVRO-1705? > Running tests using JDK 1.8 complains about MaxPermSize > --- > > Key: AVRO-1835 > URL: https://issues.apache.org/jira/browse/AVRO-1835 > Project: Avro > Issue Type: Improvement > Components: java >Affects Versions: 1.8.0 >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1835-2016-04-25.patch
[jira] [Updated] (AVRO-1841) Add clientside githooks to do basic commit validation
[ https://issues.apache.org/jira/browse/AVRO-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1841: --- Summary: Add clientside githooks to do basic commit validation (was: Automatically verify the commit messages ) > Add clientside githooks to do basic commit validation > - > > Key: AVRO-1841 > URL: https://issues.apache.org/jira/browse/AVRO-1841 > Project: Avro > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1841-20160507.patch > > > Last week I made a commit and I made an error: The commit message was not > fully according to the right format. > To avoid future mistakes I propose we introduce validation to the commit > message by using the git hooks. > These can be run in two places: client side and server side. > This ticket focuses on the client side hooks. If we decide to add this also > to the server side that must be a separate ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (AVRO-1841) Add clientside githooks to do basic commit validation
[ https://issues.apache.org/jira/browse/AVRO-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated AVRO-1841: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed > Add clientside githooks to do basic commit validation > - > > Key: AVRO-1841 > URL: https://issues.apache.org/jira/browse/AVRO-1841 > Project: Avro > Issue Type: Improvement > Components: build >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 1.8.1 > > Attachments: AVRO-1841-20160507.patch
[jira] [Created] (AVRO-1845) Invoking toString() method unexpectedly modified the avro record.
Niels Basjes created AVRO-1845: -- Summary: Invoking toString() method unexpectedly modified the avro record. Key: AVRO-1845 URL: https://issues.apache.org/jira/browse/AVRO-1845 Project: Avro Issue Type: Bug Reporter: Niels Basjes Assignee: Niels Basjes Priority: Critical Reported by Oleksandr Didukh (GitHub user ID: sashadidukh). Calling the toString method on a record that has a byte array field apparently changes the original data. Oleksandr put up a pull request: https://github.com/apache/avro/pull/88
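The ticket does not state the root cause. As a hedged illustration only: Avro's Java API represents bytes values as `java.nio.ByteBuffer`, and one general way a seemingly read-only operation can alter such a value is by using relative reads, which advance the buffer's position. The class and method names below are made up for this sketch and are not taken from Avro or from the pull request.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not Avro code and not the fix from the pull request.
class ByteBufferToStringDemo {
    // Consumes the caller's buffer: each relative get() advances its position.
    static String renderDestructively(ByteBuffer bytes) {
        StringBuilder sb = new StringBuilder();
        while (bytes.hasRemaining()) {
            sb.append((char) bytes.get());
        }
        return sb.toString();
    }

    // Reads through a duplicate, which shares the content but keeps its own
    // position, so the caller's buffer is left untouched.
    static String renderSafely(ByteBuffer bytes) {
        return renderDestructively(bytes.duplicate());
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.wrap("avro".getBytes(StandardCharsets.US_ASCII));
        renderSafely(data);
        System.out.println(data.remaining()); // prints 4: nothing was consumed
        renderDestructively(data);
        System.out.println(data.remaining()); // prints 0: the buffer now looks empty
    }
}
```

If something like this is what happens, a record would serialize differently after being logged, which would match the "changes the original data" symptom; the pull request linked above is the place to confirm the actual cause.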