[jira] [Commented] (AVRO-1756) Python implementations

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057111#comment-15057111
 ] 

Ryan Blue commented on AVRO-1756:
-

[~tebeka], we just finished migrating JS so I should have time to help out with 
merging the python implementations. What can I do to help this process along?

> Python implementations
> --
>
> Key: AVRO-1756
> URL: https://issues.apache.org/jira/browse/AVRO-1756
> Project: Avro
>  Issue Type: Improvement
>  Components: python
>Reporter: Ryan Blue
>
> Fastavro supports python 2 and 3, is fast, and its author is open to working 
> with the Apache community. This issue is to track importing fastavro as a new 
> lang/python implementation. Follow-on issues can track unifying the py and 
> py3 APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057147#comment-15057147
 ] 

Hudson commented on AVRO-1584:
--

SUCCESS: Integrated in AvroJava #560 (See 
[https://builds.apache.org/job/AvroJava/560/])
AVRO-1584: Java: Escape characters not allowed in JSON in toString.

From the JSON spec: "All Unicode characters may be placed within the
quotation marks except for the characters that must be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F)."

This uses the existing string escape function. (blue: rev 1720055)
* trunk/CHANGES.txt
* trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java
* trunk/lang/java/avro/src/test/java/org/apache/avro/generic/TestGenericData.java
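
As a rough illustration of the rule quoted in the commit message, a string escaper might look like the sketch below. This is not the code from GenericData.java, nor the existing escape function the commit reuses; the class and method names (JsonEscapeSketch, escape) are made up for the example.

```java
// Hypothetical sketch; not the escape function actually used by Avro.
final class JsonEscapeSketch {
  /** Escapes the characters that may not appear unescaped in a JSON string. */
  static String escape(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      switch (c) {
        case '"':  sb.append("\\\""); break;   // quotation mark
        case '\\': sb.append("\\\\"); break;   // reverse solidus
        case '\n': sb.append("\\n"); break;
        case '\r': sb.append("\\r"); break;
        case '\t': sb.append("\\t"); break;
        default:
          if (c < 0x20) {
            // Remaining control characters U+0000 through U+001F.
            sb.append(String.format("\\u%04x", (int) c));
          } else {
            sb.append(c);
          }
      }
    }
    return sb.toString();
  }
}
```

Characters above U+001F, including U+00FF, pass through unchanged, so the output stays legible for mostly textual byte values.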


> Json output doesn't generate base64 for byte arrays
> ---
>
> Key: AVRO-1584
> URL: https://issues.apache.org/jira/browse/AVRO-1584
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
> Environment: Pure java.
>Reporter: Christophe Lorenz
> Attachments: AVRO-1584-Jackson-Base64-Default-Variant.patch, 
> AVRO-1584.1.patch, AVRO-1584.patch
>
>
> The Json output of java generated code doesn't correctly encode byte arrays.
> Using this simple schema : 
> {"namespace": "example.avro",
>  "type": "record",
>  "name": "ByteArrayEncoding",
>  "fields": [ {"name": "data", "type": "bytes"} ]
> }
> The toString()  
>   System.out.println(new ByteArrayEncoding(ByteBuffer.wrap(new 
> byte[]{0,31,65,66,67,(byte)255,(byte)182})));
> Returns raw bytes to string in the json :
> {"data": {"bytes": "  ABC??"}}
> As a byte array is not guaranteed to be a valid string, it should be converted back 
> and forth to Base64 like other Json implementations : 
> {"data": {"bytes": "AB9BQkP/tg=="}}
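
The Base64 value requested in the report can be reproduced with the JDK alone. The sketch below assumes Java 8 or later (for java.util.Base64) and uses a hypothetical class name; it is not part of any attached patch.

```java
import java.util.Base64;

// Hypothetical sketch of the Base64 conversion the reporter asks for.
final class BytesToBase64Sketch {
  // The same seven bytes used in the issue description.
  static final byte[] SAMPLE = {0, 31, 65, 66, 67, (byte) 255, (byte) 182};

  static String toBase64(byte[] bytes) {
    return Base64.getEncoder().encodeToString(bytes);
  }

  static byte[] fromBase64(String text) {
    return Base64.getDecoder().decode(text);
  }

  public static void main(String[] args) {
    System.out.println(toBase64(SAMPLE)); // prints AB9BQkP/tg==
  }
}
```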





[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057074#comment-15057074
 ] 

Doug Cutting commented on AVRO-1584:


Ryan, your patch looks good to me.  +1



[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057087#comment-15057087
 ] 

Ryan Blue commented on AVRO-1584:
-

Committed. Thanks for taking a look, Doug!



[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057086#comment-15057086
 ] 

ASF subversion and git services commented on AVRO-1584:
---

Commit 1720055 from [~b...@cloudera.com] in branch 'avro/trunk'
[ https://svn.apache.org/r1720055 ]

AVRO-1584: Java: Escape characters not allowed in JSON in toString.

From the JSON spec: "All Unicode characters may be placed within the
quotation marks except for the characters that must be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F)."

This uses the existing string escape function.



[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057097#comment-15057097
 ] 

Ryan Blue commented on AVRO-1584:
-

[~lemieud], thank you for your work to get this addressed! I think that the fix 
we ended up with should fix the problem you were seeing since the control 
characters will be properly escaped. If moving to base64 is important to you as 
well, then I think the right way forward is to help standardize a different 
JSON encoding, like Doug suggested for the long term. For now, I'm going to 
mark this issue resolved since we've decided the way forward in the short term.



[jira] [Commented] (AVRO-1728) Update LICENSE and NOTICE files included in Java binaries

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057118#comment-15057118
 ] 

Ryan Blue commented on AVRO-1728:
-

[~busbey], could you have a look at this? I think we are close to getting the 
license docs finished. Thanks!

> Update LICENSE and NOTICE files included in Java binaries
> -
>
> Key: AVRO-1728
> URL: https://issues.apache.org/jira/browse/AVRO-1728
> Project: Avro
>  Issue Type: Sub-task
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 1.8.0
>
> Attachments: AVRO-1728.1.patch, AVRO-1728.2.patch, AVRO-1728.3.patch
>
>






[jira] [Updated] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Ryan Blue (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Blue updated AVRO-1584:

Attachment: AVRO-1584.1.patch



[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057042#comment-15057042
 ] 

Ryan Blue commented on AVRO-1584:
-

Thanks for the context, Doug.

I agree that we shouldn't change the specified JSON encoding or toString 
behavior. I think we have some flexibility with toString, since it was intended 
for debugging (so it isn't used to encode default values) and doesn't encode 
either bytes or fixed as expected. For bytes, an extra object layer is added 
and fixed is encoded as an array of integers. I think that makes it unlikely 
that anyone would use it to serialize data as JSON, but I have no problem being 
cautious and not breaking anything unless we have a plan for what toString 
should produce.

I'm attaching a patch that uses the fix from AVRO-713 to fix just the escape 
problem. It also adds tests to validate the current behavior of toString for 
bytes and fixed.

> Json output doesn't generate base64 for byte arrays
> ---
>
> Key: AVRO-1584
> URL: https://issues.apache.org/jira/browse/AVRO-1584
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
> Environment: Pure java.
>Reporter: Christophe Lorenz
> Attachments: AVRO-1584-Jackson-Base64-Default-Variant.patch, 
> AVRO-1584.patch
>
>
> The Json output of java generated code doesn't correctly encode byte arrays.
> Using this simple schema : 
> {"namespace": "example.avro",
>  "type": "record",
>  "name": "ByteArrayEncoding",
>  "fields": [ {"name": "data", "type": "bytes"} ]
> }
> The toString()  
>   System.out.println(new ByteArrayEncoding(ByteBuffer.wrap(new 
> byte[]{0,31,65,66,67,(byte)255,(byte)182})));
> Returns raw bytes to string in the json :
> {"data": {"bytes": "  ABC??"}}
> As a byte array is not guaranteed to be a valid string, it should be converted back 
> and forth to Base64 like other Json implementations : 
> {"data": {"bytes": "AB9BQkP/tg=="}}





[jira] [Commented] (AVRO-1749) Maven plugin goal: automatic schemas (.avsc) generation from Java classes (.java)

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057113#comment-15057113
 ] 

Ryan Blue commented on AVRO-1749:
-

Good idea, [~embs]. Would you like to work on this? How can we help?

> Maven plugin goal: automatic schemas (.avsc) generation from Java classes 
> (.java)
> -
>
> Key: AVRO-1749
> URL: https://issues.apache.org/jira/browse/AVRO-1749
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Matheus Santana
>Priority: Minor
>
> Current maven plugin includes goals for generating Java code (classes and 
> interfaces) from IDL (.avdl files) and Avro protocol / schemas definitions 
> (.avpr / .avsc).
> It would be nice to provide a goal for automatic [induced schemas from Java 
> code|https://avro.apache.org/docs/current/api/java/org/apache/avro/tool/InduceSchemaTool.html].





[jira] [Commented] (AVRO-1725) Enum schema exhibits same restriction to enum symbols as to names

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057121#comment-15057121
 ] 

Ryan Blue commented on AVRO-1725:
-

[~whale2], I'm not sure I understand the problem here. Is the current 
validation wrong?

> Enum schema exhibits same restriction to enum symbols as to names
> -
>
> Key: AVRO-1725
> URL: https://issues.apache.org/jira/browse/AVRO-1725
> Project: Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Nikita Makeev
>
> EnumSchema class in org.apache.avro.Schema has the following code:
> for (String symbol : symbols)
> if (ordinals.put(validateName(symbol), i++) != null)
> which validates enum symbols using validateName(). This makes it impossible to 
> use symbols that do not conform to the rules for names. 
> That prohibits symbols like "" (empty string) or anything starting with a 
> number, which does not seem to be intended.
> I guess this place requires either some other type of validation or no 
> validation at all. I can provide a patch for either case.
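
For reference, the name rule in the Avro specification (start with [A-Za-z_], then only [A-Za-z0-9_]) can be sketched as a regular expression. This is an illustration of the rule as written in the spec, not the body of validateName() in Schema.java; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the spec's name rule; not Avro's validateName().
final class NameRuleSketch {
  private static final Pattern NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  static boolean isValidName(String candidate) {
    return candidate != null && NAME.matcher(candidate).matches();
  }
}
```

Under this rule both examples from the report are rejected: the empty string has no first character, and a symbol such as "1abc" starts with a digit.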





[DISCUSS] Migrate to Java 7

2015-12-14 Thread Ryan Blue
I just noticed that our tests are still compiling and running with Java 
6. Java 7 is already end-of-life (public patches at least), so I think 
it is reasonable to start migrating. Is everyone okay with updating the 
builds and dropping support for Java 6?


rb

--
Ryan Blue


[jira] [Commented] (AVRO-1738) add java tool for outputting schema fingerprints

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057117#comment-15057117
 ] 

Ryan Blue commented on AVRO-1738:
-

[~busbey], would you like some help getting this patch finished? This could be 
a good ramp-up issue if you don't have time.

> add java tool for outputting schema fingerprints
> 
>
> Key: AVRO-1738
> URL: https://issues.apache.org/jira/browse/AVRO-1738
> Project: Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Sean Busbey
>Assignee: Sean Busbey
> Fix For: 1.7.8, 1.8.0
>
> Attachments: AVRO-1738.1.patch
>
>
> Over in AVRO-1694 I wanted to quickly check that the Java library came up 
> with the same md5/sha fingerprint for some schemas that the proposed Ruby 
> implementation does.
> I noticed we don't have a tool that exposes the functionality yet, which 
> seems like a commonly useful thing to do.
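
A minimal sketch of what such a tool computes, using only the JDK's MessageDigest. It hashes whatever text it is given; Avro's own fingerprints are defined over a schema's canonical form, which this sketch does not produce. The class and method names are hypothetical and are not taken from AVRO-1738.1.patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: hex digest of a schema text, not Avro's fingerprint tool.
final class FingerprintSketch {
  static String hexDigest(String algorithm, String schemaText) {
    try {
      byte[] digest = MessageDigest.getInstance(algorithm)
          .digest(schemaText.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b & 0xff)); // two hex digits per byte
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalArgumentException("Unknown digest: " + algorithm, e);
    }
  }
}
```

Calling hexDigest("MD5", text) and hexDigest("SHA-256", text) on the same canonical text in two language implementations gives a quick way to compare them.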





Jenkins build is back to normal : AVRO-python #581

2015-12-14 Thread Apache Jenkins Server
See 



buildbot failure in ASF Buildbot on avro-java6-ubuntu

2015-12-14 Thread buildbot
The Buildbot has detected a new failure on builder avro-java6-ubuntu while 
building ASF Buildbot. Full details are available at:
http://ci.apache.org/builders/avro-java6-ubuntu/builds/16

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: bb-vm_ubuntu

Build Reason: The AnyBranchScheduler scheduler named 'AvroJava' triggered this 
build
Build Source Stamp: [branch avro/trunk] 1720055
Blamelist: blue

BUILD FAILED: failed test

Sincerely,
 -The Buildbot





[jira] [Created] (AVRO-1768) stdin support for getschema

2015-12-14 Thread Bennie Schut (JIRA)
Bennie Schut created AVRO-1768:
--

 Summary: stdin support for getschema
 Key: AVRO-1768
 URL: https://issues.apache.org/jira/browse/AVRO-1768
 Project: Avro
  Issue Type: Improvement
Affects Versions: 1.7.7
Reporter: Bennie Schut
Priority: Minor


It would be nice to support reading from stdin on getschema calls so you don't 
need a local file first.
Somewhat similar to AVRO-1583.
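
One way the requested behaviour could look, as a sketch only: treat a designated argument as "read from stdin" and anything else as a file path. Using "-" for stdin is an assumption made for this example, and the helper name openInput is hypothetical; neither is taken from the pull request.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch; the "-" convention and the names are assumptions.
final class StdinInputSketch {
  static InputStream openInput(String arg) {
    if ("-".equals(arg)) {
      return System.in; // no local file needed
    }
    try {
      return new FileInputStream(arg);
    } catch (FileNotFoundException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```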





[jira] [Commented] (AVRO-1768) stdin support for getschema

2015-12-14 Thread Bennie Schut (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056087#comment-15056087
 ] 

Bennie Schut commented on AVRO-1768:


Code is in: https://github.com/apache/avro/pull/65 but I don't mind making the 
svn version if that's preferred.

> stdin support for getschema
> ---
>
> Key: AVRO-1768
> URL: https://issues.apache.org/jira/browse/AVRO-1768
> Project: Avro
>  Issue Type: Improvement
>Affects Versions: 1.7.7
>Reporter: Bennie Schut
>Priority: Minor
>
> It would be nice to support reading from stdin on getschema calls so you 
> don't need a local file first.
> Somewhat similar to AVRO-1583.





[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread David Lemieux (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057312#comment-15057312
 ] 

David Lemieux commented on AVRO-1584:
-

[~rdblue] My pleasure. I agree that it should fix my problem. Thanks



[GitHub] avro pull request: stdin support for getschema

2015-12-14 Thread bennies
GitHub user bennies opened a pull request:

https://github.com/apache/avro/pull/65

stdin support for getschema

It would be nice to support reading from stdin on getschema calls so you 
don't need a local file first.
Somewhat similar to AVRO-1583. Not sure if it's preferred to create a jira 
for this?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bennies/avro trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/avro/pull/65.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #65


commit 60bc6cc91bcdde63eb2c21436bd456592e3cc838
Author: Bennie Schut 
Date:   2015-12-14T12:47:18Z

stdin support for getschema




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (AVRO-1583) Add stdin support to tojson

2015-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055942#comment-15055942
 ] 

ASF GitHub Bot commented on AVRO-1583:
--

GitHub user bennies opened a pull request:

https://github.com/apache/avro/pull/65

stdin support for getschema

It would be nice to support reading from stdin on getschema calls so you 
don't need a local file first.
Somewhat similar to AVRO-1583. Not sure if it's preferred to create a jira 
for this?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bennies/avro trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/avro/pull/65.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #65


commit 60bc6cc91bcdde63eb2c21436bd456592e3cc838
Author: Bennie Schut 
Date:   2015-12-14T12:47:18Z

stdin support for getschema




> Add stdin support to tojson
> ---
>
> Key: AVRO-1583
> URL: https://issues.apache.org/jira/browse/AVRO-1583
> Project: Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Clément MAHTIEU
>Assignee: Clément MAHTIEU
> Fix For: 1.8.0
>
> Attachments: AVRO-1583.patch
>
>
> Unlike most of the avro tools, tojson does not currently support reading from 
> stdin. Adding this support would be quite convenient. For example, a pipe of 
> concat and tojson is much easier to use than having to write a shell script 
> to create a local file.
> A comment in the source code says that stdin is not supported because we need 
> a seekable stream; however, I believe this comment is outdated since it just 
> works and other tools have no issue dealing with stdin.





[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056779#comment-15056779
 ] 

Doug Cutting commented on AVRO-1584:


Ryan, I agree this is a bug in the current implementation.  According to 
RFC 4627, control characters must be escaped.
bq. All Unicode characters may be placed within the quotation marks except for 
the characters that must be escaped: quotation mark, reverse solidus, and the 
control characters (U+0000 through U+001F).
I note that this was fixed for strings in AVRO-713 and we can probably share 
this logic.

The difference between toString() JSON and Avro's JSON data encoding is 
longstanding and primarily around the encoding of unions.  For full read/write 
fidelity, many union values must be tagged with their type, so that's what the 
JSON encoding requires.  The toString() encoding was not intended for data 
fidelity but for debugging, so a simpler version was implemented.  (It actually 
pre-dates the specification of the JSON encoding.)  It so happens that default 
values in schemas do not need to be tagged, so the toString() format is 
identical to the default-value format.

However there are frequent requests for a reader that accepts such an untagged 
format, for interaction with other JSON-generating software.  In retrospect, 
the JSON encoding should perhaps not require tagging for unions with null or 
unions between a primitive and a non-primitive, i.e., only tag unions when it's 
required.  We instead opted for simplicity of specification and implementation, to 
ease interoperability between various Avro implementations, when perhaps in 
this case we should have optimized for ease of interoperability with non-Avro 
producers and consumers of JSON.

So long-term we might add an encoder/decoder that doesn't handle unions at all 
or that handles them more parsimoniously, then perhaps implement default values 
and toString() using this encoding.  But I don't think we should alter the 
currently specified JSON encoding, nor change the default or toString() format.
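
To make the tagged versus untagged distinction concrete: in Avro's JSON encoding a non-null union value is wrapped in a single-field object naming its branch type, while the simpler form writes the bare value. The sketch below only builds the two strings by hand; it does not use Avro's encoder classes, and its names are hypothetical.

```java
// Hypothetical sketch of the two union representations discussed above.
final class UnionJsonSketch {
  /** Tagged form: null stays null, any other branch is wrapped in {"type": value}. */
  static String tagged(String branchType, String jsonValue) {
    if ("null".equals(branchType)) {
      return "null";
    }
    return "{\"" + branchType + "\": " + jsonValue + "}";
  }

  /** Untagged form: the bare JSON value, as a non-Avro producer would write it. */
  static String untagged(String jsonValue) {
    return jsonValue;
  }

  public static void main(String[] args) {
    System.out.println(tagged("string", "\"a\"")); // prints {"string": "a"}
    System.out.println(untagged("\"a\""));         // prints "a"
  }
}
```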





[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056532#comment-15056532
 ] 

Doug Cutting commented on AVRO-1584:


The problem you originally cite (question marks in output) is caused by using a 
non-UTF8 encoding when printing the value of toString(), not with that value 
itself.  So there's not actually a bug here.  The string produced by toString() 
loses no information.  Rather, you seek either an (incompatible) change or a new 
feature.

Changing the format of toString() for binary values incompatibly to base64 
seems likely to break applications, e.g. those that use toString() to 
supply default values to the schema builder API.  I question that this is of 
sufficient benefit to be worth doing even in a release that permits 
incompatibilities.  There is no perfect string format for binary values.  The 
one currently used here (and by the spec for default values) makes textual 
values more legible, while base64 makes non-textual values more tolerant of 
non-UTF8-safe i/o.

Perhaps we should instead add a flag that one can set to change 
GenericData#toString() so that it generates base64?  We should also certainly 
add some tests for the current format if there are none.
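
The question-mark effect can be reproduced without Avro. In this sketch a string holding the characters U+00FF and U+00B6 is passed through a charset that cannot represent them, as an ASCII-only console would do. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: what a string looks like after a lossy output encoding.
final class LossyPrintSketch {
  /** Encodes the value with the given charset and decodes it again. */
  static String printedAs(String value, Charset outputCharset) {
    // getBytes substitutes the charset's replacement byte for unmappable chars.
    return new String(value.getBytes(outputCharset), outputCharset);
  }

  public static void main(String[] args) {
    String value = "ABC" + (char) 0xFF + (char) 0xB6;
    System.out.println(printedAs(value, StandardCharsets.US_ASCII)); // prints ABC??
    System.out.println(printedAs(value, StandardCharsets.UTF_8).equals(value)); // prints true
  }
}
```

The string itself still holds both characters; only the ASCII rendering loses them.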



[jira] [Commented] (AVRO-1746) JSON encoder and decoder for Ruby

2015-12-14 Thread Saroj Yadav (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056447#comment-15056447
 ] 

Saroj Yadav commented on AVRO-1746:
---

[~b...@cloudera.com] just opened [this 
PR|https://github.com/apache/avro/pull/64] as per our conversation. Please let 
me know how it looks. I am hoping to wrap things up before the holidays. 

> JSON encoder and decoder for Ruby
> -
>
> Key: AVRO-1746
> URL: https://issues.apache.org/jira/browse/AVRO-1746
> Project: Avro
>  Issue Type: New Feature
>  Components: ruby
>Reporter: Saroj Yadav
>
> Hi,
> I am thinking of adding a JSON encoder and decoder for Ruby. What do you 
> think?
> Best,
> Saroj





[jira] [Updated] (AVRO-1663) C Library does not handle enum's namespace

2015-12-14 Thread Martin Kleppmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/AVRO-1663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Kleppmann updated AVRO-1663:
---
Attachment: AVRO-1663-2.patch

The previous patch doesn't fully address the problem (for example, it does not 
actually save the namespace in the enum schema struct, so when it comes to 
writing the schema out to JSON again, the namespace will be omitted).

I'm attaching a more comprehensive patch. It also deals with "fixed" types 
(which, being named types, should also be namespace-aware), it deals with 
fully-qualified names appearing in the "name" field (as permitted by the spec), 
and it correctly round-trips to and from JSON.

In this patch I've made some public API changes: added a function 
{{avro_schema_namespace}} which returns the namespace of a named schema (as 
suggested in AVRO-1565), and added a namespace parameter to the 
{{avro_schema_fixed}} and {{avro_schema_enum}} constructors. What's the policy 
for such API changes in the C implementation? Rather than changing the existing 
function signature, would it be better to add alternative versions that take a 
namespace argument (e.g. {{avro_schema_fixed_ns}} and {{avro_schema_enum_ns}})?
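
The name resolution the patch has to implement follows the spec's rules for named types: a dotted name is already a full name, otherwise an explicit namespace wins, otherwise the enclosing namespace is inherited. A sketch of that logic follows, written in Java for brevity rather than in the C library's own style; the class and method names are hypothetical and are not the patch's API.

```java
// Hypothetical sketch of full-name resolution for named types (record, enum, fixed).
final class FullNameSketch {
  static String fullName(String name, String namespace, String enclosingNamespace) {
    if (name.indexOf('.') >= 0) {
      return name; // already fully qualified; any namespace attribute is ignored
    }
    String ns = (namespace != null) ? namespace : enclosingNamespace;
    return (ns == null || ns.isEmpty()) ? name : ns + "." + name;
  }

  public static void main(String[] args) {
    // The enum from the issue's schema, nested in a record in another namespace.
    System.out.println(
        fullName("EventName", "com.company.models", "com.company.avro.schemas"));
    // prints com.company.models.EventName
  }
}
```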

> C Library does not handle enum's namespace
> --
>
> Key: AVRO-1663
> URL: https://issues.apache.org/jira/browse/AVRO-1663
> Project: Avro
>  Issue Type: Bug
>  Components: c
>Affects Versions: 1.7.7
>Reporter: Thomas Sanchez
> Attachments: AVRO-1663-2.patch, AVRO-1663.patch
>
>
> {code}{
>   "type": "record",
>   "name": "EventName",
>   "namespace": "com.company.avro.schemas",
>   "fields": [
> {"name": "eventname_model",
>  "type": {
>"type": "enum",
>"namespace": "com.company.models",
>"name": "EventName",
>"symbols": [""]
>  }
> }
>  ]
> }
> {code}
> Such a schema is perfectly valid but the C library does not handle it because 
> it does not parse the namespace field.





[jira] [Commented] (AVRO-1584) Json output doesn't generate base64 for byte arrays

2015-12-14 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056701#comment-15056701
 ] 

Ryan Blue commented on AVRO-1584:
-

It looks like the conversion used for default values is independent of 
toString. Callers can pass either a JsonNode, which bypasses the problem, or an 
object that gets [converted in 
JacksonUtils|https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/util/internal/JacksonUtils.java#L73].
 That converts a byte array to a string using ISO-8859-1, which correctly 
implements the spec. When the JSON is written, the characters that aren't 
allowed in JSON strings are escaped by the generator. Changing the output of 
toString won't break the case that Doug mentions, but I think it is a fair 
point that changing what is currently produced could break applications.

However, the JSON currently produced by toString is broken because it doesn't 
convert control characters to escape sequences (0x0a to \n). We could safely 
fix that problem without moving to base64 and I think at a minimum we should do 
that.

But this still leaves a problem: what do we do about toString not conforming to 
the JSON required by the Avro spec?
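
The ISO-8859-1 conversion mentioned above is lossless because each byte maps to exactly one character in the range U+0000 through U+00FF. A small JDK-only sketch, with hypothetical names and no dependency on JacksonUtils:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the byte <-> string mapping via ISO-8859-1.
final class Latin1RoundTripSketch {
  static String bytesToString(byte[] bytes) {
    return new String(bytes, StandardCharsets.ISO_8859_1);
  }

  static byte[] stringToBytes(String text) {
    return text.getBytes(StandardCharsets.ISO_8859_1);
  }
}
```

The resulting string can still contain control characters such as U+0000 and U+001F, which is why it must be escaped when it is written out as JSON.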
