Re: [REPORT] Board report for Apache Avro Apr 2020

2020-04-09 Thread Andy Le
Thank you, Sean. I'll have a look at it.

On 2020/04/10 02:46:26, Sean Busbey  wrote: 
> Hi Andy!
> 
> As a project we maintain a guide for folks looking to get started
> contributing to the project:
> 
> https://cwiki.apache.org/confluence/display/AVRO/How+To+Contribute
> 
> 
> On Thu, Apr 9, 2020 at 5:35 AM Andy Le  wrote:
> >
> > Hi Sean,
> >
> > It's great to receive your summary. I'm new to the Apache Avro project. Are
> > there any guidelines on how to become a member of Apache Avro?
> >
> > Thanks & be strong!
> >
> > On 2020/04/09 05:12:16, Sean Busbey  wrote:
> > > Hi folks!
> > >
> > > > Here's the report for the quarter I submitted. Let me know if there are
> > > > any last minute changes y'all would like to see.
> > >
> > > 
> > > ## Description:
> > > > Apache Avro is a data serialization system with a compact binary format.
> > > > It is used for storing and transporting schema-driven serialized data. The
> > > > unique features of Avro include automatic schema resolution: when the
> > > > reader's expected schema differs from the schema with which the data was
> > > > serialized, the data is automatically adapted to meet the reader's
> > > > requirements.
> > >
> > > ## Issues:
> > > The project currently has no issues that require board attention.
> > >
> > > ## Membership Data:
> > > Apache Avro was founded 2010-04-20 (10 years ago)
> > > There are currently 33 committers and 23 PMC members in this project.
> > > The Committer-to-PMC ratio is roughly 3:2.
> > >
> > > Community changes, past quarter:
> > > - No new PMC members. Last addition was Nándor Kollár on 2019-08-29.
> > > - No new committers. Last addition was Ryan Skraba on 2019-12-12.
> > >
> > > ## Project Activity:
> > > > Apache Avro 1.9.2 was released on 2020-02-12. This release included some
> > > > new experimental features that try to improve performance[1].
> > >
> > > > Work has continued to update build tools, language versions, third-party
> > > > dependencies, and ease of integration in preparation for a new major
> > > > release, version 1.10.0, currently planned for May 2020.
> > >
> > > > The previously reported need to document and update how the project
> > > > versions its releases came up for discussion again, but no action has
> > > > been taken yet.
> > >
> > > ## Numbers
> > > For those who prefer metrics:
> > >
> > > Mailing Lists:
> > >  - dev@avro.apache.org had 1034 emails (24% increase)
> > >  - u...@avro.apache.org had 91 emails (15% increase)
> > >
> > > JIRA:
> > >  - 115 issues opened (42% increase)
> > >  - 76 issues closed (7% increase)
> > >
> > > GitHub:
> > >  - 88 PRs open (9% increase)
> > >  - 68 PRs closed (7% decrease)
> > >
> > > Code Repository:
> > >  - 122 commits in the past quarter (67% increase)
> > >  - 25 code contributors in the past quarter (39% increase)
> > >
> > > ## Community Health:
> > > > The community is doing well at drawing in contributions. The PMC still
> > > > needs to work to recognize contributors through committership. Current
> > > > focus is on working towards releases.
> > >
> > > > [1]: user-facing details about these experimental additions are available at:
> > > >  https://s.apache.org/6pcpo
> > >
> 


Re: [REPORT] Board report for Apache Avro Apr 2020

2020-04-09 Thread Sean Busbey
Hi Andy!

As a project we maintain a guide for folks looking to get started
contributing to the project:

https://cwiki.apache.org/confluence/display/AVRO/How+To+Contribute




[jira] [Commented] (AVRO-2775) JacksonUtils: exception when calling toJsonNode()

2020-04-09 Thread Andy Le (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080173#comment-17080173
 ] 

Andy Le commented on AVRO-2775:
---

[~cutting]

> changing this would permit them to include arbitrary Java data structures

That's correct. Such structures could be represented as Maps.

> which might obscure a bug if this was not intended

Could you elaborate on this?

> JacksonUtils: exception when calling toJsonNode() 
> --
>
> Key: AVRO-2775
> URL: https://issues.apache.org/jira/browse/AVRO-2775
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.9.2
>Reporter: Andy Le
>Priority: Major
>
> I've got a simple test as follows:
> {code:java}
> public class TestJacksonUtils {
>   public static class Age {
>     public int value = 9;
>   }
>
>   @Test
>   public void testToJson() {
>     Map<String, Object> kv = new HashMap<>();
>     kv.put("age", 9);
>     JsonNode node1 = JacksonUtils.toJsonNode(kv); // -> This is OK
>     Object obj = new Age();
>     JsonNode node2 = JacksonUtils.toJsonNode(obj); // -> This will trigger an exception
>   }
> }
> {code}
> When I ran the test:
> {noformat}
> org.apache.avro.AvroRuntimeException: Unknown datum class: class org.apache.avro.util.internal.TestJacksonUtils$Age
>   at org.apache.avro.util.internal.JacksonUtils.toJson(JacksonUtils.java:87)
>   at org.apache.avro.util.internal.JacksonUtils.toJsonNode(JacksonUtils.java:48)
>   at org.apache.avro.util.internal.TestJacksonUtils.testToJson(TestJacksonUtils.java:20)
> {noformat}
> I've read the code & tests for JacksonUtils. Instead of raising an exception at
> [line #87|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/util/internal/JacksonUtils.java#L87],
> I think we could automatically convert objects into maps, and everything would be fine.
> My questions are:
> - Is raising an exception acceptable?
> - Is there any other way to use `toJsonNode` with general objects?
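
As a possible caller-side workaround for the second question, here is a minimal sketch that flattens an object's public instance fields into a Map using only the JDK. The test above shows that JacksonUtils.toJsonNode accepts a Map, so the result could presumably be passed to it. FieldMapSketch and toMap are hypothetical names and not part of Avro. Field values that are themselves arbitrary objects would need the same treatment, which this sketch does not attempt.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Avro: copies public instance fields into a Map.
final class FieldMapSketch {

  // Mirrors the Age class from the test in the issue description.
  public static class Age {
    public int value = 9;
  }

  static Map<String, Object> toMap(Object datum) {
    Map<String, Object> result = new LinkedHashMap<>();
    // getFields() returns only public fields, including inherited ones.
    for (Field field : datum.getClass().getFields()) {
      if (Modifier.isStatic(field.getModifiers())) {
        continue; // constants are not part of the object's state
      }
      try {
        result.put(field.getName(), field.get(datum));
      } catch (IllegalAccessException e) {
        throw new IllegalStateException("Cannot read field " + field.getName(), e);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(toMap(new Age())); // prints {value=9}
  }
}
```

Keeping the conversion on the caller's side leaves toJsonNode strict, which sidesteps the concern that accepting arbitrary objects might hide a bug.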



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AVRO-2775) JacksonUtils: exception when calling toJsonNode()

2020-04-09 Thread Doug Cutting (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079561#comment-17079561
 ] 

Doug Cutting commented on AVRO-2775:


The only downside I can see is that for applications that only intend to use 
the types listed in 
[JsonProperties|https://avro.apache.org/docs/current/api/java/org/apache/avro/JsonProperties.html],
 changing this would permit them to include arbitrary Java data structures, 
which might obscure a bug if this was not intended.  Are there other 
compatibility risks to adding this functionality?






[jira] [Commented] (AVRO-2723) Avro Java: Obtaining default field values for POJO objects with ReflectData

2020-04-09 Thread Doug Cutting (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079534#comment-17079534
 ] 

Doug Cutting commented on AVRO-2723:


Is there a reason not to use the default value from the Java class as the 
default value in the schema?  Perhaps this could be an optional feature?  Would 
this break existing applications?

> Avro Java: Obtaining default field values for POJO objects with ReflectData
> ---
>
> Key: AVRO-2723
> URL: https://issues.apache.org/jira/browse/AVRO-2723
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: java
>Affects Versions: 1.9.1
>Reporter: Andy Le
>Priority: Critical
> Attachments: Screen Shot 2020-03-08 at 16.13.29.png
>
>
> Hi guys,
>  
> I've got a simple app using Avro Reflection:
>  
> {code:java}
> import org.apache.avro.Schema;
> import org.apache.avro.reflect.ReflectData;
>
> public class App {
>   public static void main(String[] args) {
>     testReflection();
>   }
>
>   static class User {
>     public String first = "Andy";
>     public String last = "Le";
>   }
>
>   static void testReflection() {
>     // get the reflected schema for User
>     Schema schema = ReflectData.AllowNull.get().getSchema(User.class);
>     System.out.println(schema.toString(true));
>   }
> }
> {code}
> The output on console will be:
> {noformat}
> {
>   "type" : "record",
>   "name" : "User",
>   "namespace" : "App",
>   "fields" : [ {
>     "name" : "first",
>     "type" : [ "null", "string" ],
>     "default" : null
>   }, {
>     "name" : "last",
>     "type" : [ "null", "string" ],
>     "default" : null
>   } ]
> }
> {noformat}
>  
> As you can see, the schema does not carry the default values from the Java
> fields ("Andy" and "Le"); every field defaults to null. Could you please tell
> me how to obtain those values?
> Thank you.
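
For illustration, the Java-side defaults the reporter is asking about can be read with plain reflection by instantiating the class and inspecting its fields. This is only a sketch under the assumption that the class has a no-argument constructor. PojoDefaultsSketch and defaultsOf are hypothetical names, nothing here uses or changes ReflectData, and wiring the values into a schema is not shown.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Avro: reads field initializer values of a POJO.
final class PojoDefaultsSketch {

  // Mirrors the User class from the issue description.
  static class User {
    public String first = "Andy";
    public String last = "Le";
  }

  static Map<String, Object> defaultsOf(Class<?> type) {
    try {
      // Assumption: the class declares (or inherits implicitly) a no-arg constructor.
      Constructor<?> constructor = type.getDeclaredConstructor();
      constructor.setAccessible(true);
      Object instance = constructor.newInstance();

      Map<String, Object> defaults = new LinkedHashMap<>();
      for (Field field : type.getDeclaredFields()) {
        if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
          continue; // only instance state written by field initializers is relevant
        }
        field.setAccessible(true);
        defaults.put(field.getName(), field.get(instance));
      }
      return defaults;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot read defaults of " + type.getName(), e);
    }
  }

  public static void main(String[] args) {
    // Expected to contain first=Andy and last=Le.
    System.out.println(defaultsOf(User.class));
  }
}
```

Whether such values should end up as schema defaults, and whether that should be optional, is the open question raised in the comment above.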





[jira] [Commented] (AVRO-2793) Schema compatibilty should consider fullname of records

2020-04-09 Thread Doug Cutting (Jira)


[ 
https://issues.apache.org/jira/browse/AVRO-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079492#comment-17079492
 ] 

Doug Cutting commented on AVRO-2793:


Looks like a bug to me.  Can you submit a pull request that fixes this, with 
tests?  Thanks!

> Schema compatibility should consider fullname of records
> ---
>
> Key: AVRO-2793
> URL: https://issues.apache.org/jira/browse/AVRO-2793
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.9.2
>Reporter: Jurgis Pods
>Priority: Major
>
> Consider the following example:
> {code:java}
> Schema writerSchema = Schema.createRecord("fieldname", null, "namespace1", false, Collections.emptyList());
> Schema readerSchema = Schema.createRecord("fieldname", null, "namespace2", false, Collections.emptyList());
> SchemaPairCompatibility compat =
>     SchemaCompatibility.checkReaderWriterCompatibility(readerSchema, writerSchema);
> // compat.getType() should be SchemaCompatibilityType.INCOMPATIBLE, but is
> // actually SchemaCompatibilityType.COMPATIBLE
> {code}
> I would expect the validation to yield an incompatible result, as records
> should have identical fullnames.
> This issue is similar to AVRO-2322, but the other way around: here the
> namespace differs, not the record name.
> The root cause seems to be in
> [SchemaCompatibility::schemaNameEquals|https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaCompatibility.java#L97],
> where getName() is used instead of getFullName().
> Is there any reason not to be strict here and use the fullname for
> validation? We ran into severe problems after changing a record's namespace
> in a newer schema version. The Avro schema compatibility check passed,
> so we deployed with confidence. However, the change then caused
> problems both for Confluent's Kafka S3 Connector and for Amazon Athena
> when reading data produced by the new schema.
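
The difference between the two comparisons described above can be shown with a toy model. NamedSchema below is a hypothetical stand-in, not Avro's Schema class, and the sketch ignores aliases and anything else the real compatibility check may consider. It only contrasts comparing short names with comparing full names.

```java
import java.util.Objects;

// Toy model only: contrasts short-name and full-name comparison of named types.
final class FullNameSketch {

  // Hypothetical stand-in for a named schema such as a record.
  static final class NamedSchema {
    private final String name;
    private final String namespace;

    NamedSchema(String name, String namespace) {
      this.name = name;
      this.namespace = namespace;
    }

    String getName() {
      return name;
    }

    // Full name is "namespace.name", or just the name when there is no namespace.
    String getFullName() {
      return (namespace == null || namespace.isEmpty()) ? name : namespace + "." + name;
    }
  }

  // Comparison on the short name only, as the report says the current check does.
  static boolean shortNameEquals(NamedSchema reader, NamedSchema writer) {
    return Objects.equals(reader.getName(), writer.getName());
  }

  // Stricter comparison on the full name, as the report proposes.
  static boolean fullNameEquals(NamedSchema reader, NamedSchema writer) {
    return Objects.equals(reader.getFullName(), writer.getFullName());
  }

  public static void main(String[] args) {
    NamedSchema writer = new NamedSchema("fieldname", "namespace1");
    NamedSchema reader = new NamedSchema("fieldname", "namespace2");
    System.out.println(shortNameEquals(reader, writer)); // prints true
    System.out.println(fullNameEquals(reader, writer));  // prints false
  }
}
```

With the short-name comparison the two records from the example look identical, which matches the COMPATIBLE result the reporter observed.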





Re: [REPORT] Board report for Apache Avro Apr 2020

2020-04-09 Thread Andy Le
Hi Sean,

It's great to receive your summary. I'm new to the Apache Avro project. Are
there any guidelines on how to become a member of Apache Avro?

Thanks & be strong!
