[jira] [Resolved] (AVRO-1190) C++ json parser fails to decode multibyte unicode code points

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. resolved AVRO-1190.
---
   Resolution: Fixed
Fix Version/s: 1.9.0

Merged the Pull Request

> C++ json parser fails to decode multibyte unicode code points
> -
>
> Key: AVRO-1190
> URL: https://issues.apache.org/jira/browse/AVRO-1190
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.0
>Reporter: Keh-Li Sheng
>Priority: Major
> Fix For: 1.9.0
>
>
> The parser in JsonIO.cc does not handle decoding a multibyte unicode 
> character into any kind of valid character encoding for a std::string in c++. 
> The following snippet from JsonParser::tryString() has several flaws:
> 1. sv is a std::string used as a vector, where each unit is a char
> 2. a single unicode hex quad encoded in JSON can represent a 16-bit value
> 3. a unicode hex quad can represent a "high surrogate" character meaning that 
> it must be combined with the following quad to derive the full unicode code 
> point
> 4. \U is not a valid unicode escape for JSON (see 
> http://www.ietf.org/rfc/rfc4627.txt)
> {code:title=JsonIO.cc}
> case 'u':
> case 'U':
>     {
>         unsigned int n = 0;
>         char e[4];
>         in_.readBytes(reinterpret_cast<uint8_t*>(e), 4);
>         for (int i = 0; i < 4; i++) {
>             n *= 16;
>             char c = e[i];
>             if (isdigit(c)) {
>                 n += c - '0';
>             } else if (c >= 'a' && c <= 'f') {
>                 n += c - 'a' + 10;
>             } else if (c >= 'A' && c <= 'F') {
>                 n += c - 'A' + 10;
>             } else {
>                 throw unexpected(c);
>             }
>         }
>         sv.push_back(n);
>     }
> {code}
> This loop decodes the quad into a temporary unsigned int and then 
> simply pushes that value (which may need 16 bits) onto the std::string, 
> where it is narrowed to a single char. 
> This essentially means that the JSON parser does not properly decode 
> multibyte unicode characters. For example, this JSON string:
> {noformat}
> "Dress up if you dare! Free cover all night! \uD83C\uDF83\uD83D\uDC7B"
> {noformat}
> results in a decoded byte sequence for the last 4 characters:
> {noformat}
> 3C 83 3D 7B 00
> {noformat}
> where you can see that it simply drops the high order bytes. In this 
> particular example, \uD83C is a high-surrogate character which requires some 
> additional handling. I am not sure what users of the c++ library expect the 
> encoding to be, but given that we are working with json and given that avro 
> c++ uses char instead of wchar, I would assume users would expect a UTF-8 
> encoded string. However, I could be wrong. There are many examples of 
> decoders that handle this string properly - I found this one helpful while 
> implementing a fix: http://rishida.net/tools/conversion/
> For basics on UTF-8 http://www.utf-8.com/
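
The handling the reporter describes (combine a high surrogate with the following low surrogate, then emit UTF-8) can be sketched roughly as below. This is a minimal illustration, not the code that was merged for AVRO-1190: the function names parseHexQuad, appendUtf8 and decodeUnicodeEscapes are hypothetical, the input is assumed to be the raw contents of a JSON string, and only \uXXXX escapes are interpreted (every other character, including other backslash escapes, is copied through unchanged).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Parse exactly four hex digits starting at s[pos]; throws on malformed input.
static std::uint32_t parseHexQuad(const std::string& s, std::size_t pos) {
    if (pos + 4 > s.size()) throw std::runtime_error("truncated \\u escape");
    std::uint32_t n = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        const char c = s[pos + i];
        n *= 16;
        if (c >= '0' && c <= '9')      n += static_cast<std::uint32_t>(c - '0');
        else if (c >= 'a' && c <= 'f') n += static_cast<std::uint32_t>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') n += static_cast<std::uint32_t>(c - 'A' + 10);
        else throw std::runtime_error("bad hex digit in \\u escape");
    }
    return n;
}

// Append the UTF-8 encoding of code point cp (1 to 4 bytes) to out.
static void appendUtf8(std::string& out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Decode \uXXXX escapes in s into UTF-8, pairing surrogates where needed.
std::string decodeUnicodeEscapes(const std::string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size();) {
        if (s[i] != '\\' || i + 1 >= s.size() || s[i + 1] != 'u') {
            out.push_back(s[i++]);  // not a \u escape: copy through
            continue;
        }
        std::uint32_t cp = parseHexQuad(s, i + 2);
        i += 6;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: the next escape must be a low surrogate.
            if (i + 1 >= s.size() || s[i] != '\\' || s[i + 1] != 'u')
                throw std::runtime_error("unpaired high surrogate");
            const std::uint32_t lo = parseHexQuad(s, i + 2);
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::runtime_error("invalid low surrogate");
            i += 6;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }
        appendUtf8(out, cp);
    }
    return out;
}
```

With this sketch, the four escapes from the example string, \uD83C\uDF83\uD83D\uDC7B, decode to the two code points U+1F383 and U+1F47B, that is, the UTF-8 bytes F0 9F 8E 83 F0 9F 91 BB rather than 3C 83 3D 7B.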



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AVRO-1190) C++ json parser fails to decode multibyte unicode code points

2018-12-29 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730900#comment-16730900
 ] 

ASF subversion and git services commented on AVRO-1190:
---

Commit 8f94f5647b6351c219fa105f37cd01f156427f71 in avro's branch 
refs/heads/master from Thiruvalluvan M. G.
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=8f94f56 ]

Merge pull request #417 from thiru-apache/AVRO-1190

UTF-8 support for JSON in C++

> C++ json parser fails to decode multibyte unicode code points
> -
>
> Key: AVRO-1190
> URL: https://issues.apache.org/jira/browse/AVRO-1190
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Affects Versions: 1.7.0
>Reporter: Keh-Li Sheng
>Priority: Major
>





[jira] [Updated] (AVRO-1137) Could we have a folder with examples/samples in the source code

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1137:
--
Component/s: build

> Could we have a folder with examples/samples in the source code
> ---
>
> Key: AVRO-1137
> URL: https://issues.apache.org/jira/browse/AVRO-1137
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: build
> Environment: all
>Reporter: Ajo Fod
>Priority: Major
> Attachments: avro.helloworld.zip
>
>
> I don't know if there is a collection of Avro usage examples anywhere 
> (something like what jfreechart has). I recently posted on Stack Overflow an 
> example of a problem I ran into:
> http://stackoverflow.com/questions/11866466/using-apache-avro-reflect





[jira] [Updated] (AVRO-1774) Update documentation with instructions/examples for using different code generation template

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1774:
--
Component/s: doc

> Update documentation with instructions/examples for using different code 
> generation template
> 
>
> Key: AVRO-1774
> URL: https://issues.apache.org/jira/browse/AVRO-1774
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: doc
>Affects Versions: 1.7.7
>Reporter: Jake Robb
>Priority: Major
>
> AVRO-1209 added a template for generating immutable classes. I can't find 
> anything in the Avro docs that tells me how to use it, and it is not obvious 
> to me (as a total n00b to Avro) how to do so. It seems like I'm supposed to 
> specify a different {{templateDirectory}} to avro-maven-plugin, but I'm not 
> sure what value to provide there.
> Hopefully this is an easy one. I'll keep trying to figure it out, and if I do 
> so before anybody beats me to it, I'll happily write some instructions in a 
> comment here so that someone with edit privs to the docs can just paste it in 
> (or should it be in the Wiki section of the docs? It's unclear why there are 
> Wiki and non-Wiki docs, or what should go in which part). :) 
> Thanks!





[jira] [Updated] (AVRO-1269) AVRO is converting ORACLE,Netezza,Teradata decimals & long integers to Strings.

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1269:
--
Component/s: java

> AVRO is converting ORACLE,Netezza,Teradata decimals & long integers to 
> Strings.
> ---
>
> Key: AVRO-1269
> URL: https://issues.apache.org/jira/browse/AVRO-1269
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.1
>Reporter: Prasad Dasari
>Priority: Major
>
> I tried to sqoop Oracle, Netezza and Teradata tables in Avro format using plain 
> JDBC (without using the Cloudera connectors). I can see that DECIMAL and NUMERIC 
> data types are being converted to Avro strings.
> Oracle: NUMBER and INTEGER data types are converted to Avro string format.
> Netezza: DECIMAL and NUMERIC data types are converted to Avro string format.
> Teradata: DECIMAL and LONG data types are converted to Avro string format.
> When I tried using map-columns to map them to BigDecimal or BigInteger, I got 
> an error message saying that Avro does not support BigDecimal.
> Thanks,
> Prasad Dasari.





[jira] [Updated] (AVRO-1180) Broken links on Code Review Checklist page on confluence

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1180:
--
Component/s: build

> Broken links on Code Review Checklist page on confluence
> 
>
> Key: AVRO-1180
> URL: https://issues.apache.org/jira/browse/AVRO-1180
> Project: Apache Avro
>  Issue Type: Task
>  Components: build
>Reporter: Pradeep Gollakota
>Priority: Trivial
>
> The [Code Review 
> Checklist|https://cwiki.apache.org/confluence/display/AVRO/Code+Review+Checklist]
>  has two broken links.
> The link referencing Sun's code conventions points to 
> http://java.sun.com/docs/codeconv/
> This link should be updated to (I'm guessing) 
> http://www.oracle.com/technetwork/java/javase/documentation/codeconvtoc-136057.html
> The link referencing Log4j's Level class points to 
> http://logging.apache.org/log4j/docs/api/org/apache/log4j/Level.html
> This should be updated to 
> https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html





[jira] [Updated] (AVRO-1025) migrate website & dist to svnpubsub

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1025:
--
Component/s: build

> migrate website & dist to svnpubsub
> ---
>
> Key: AVRO-1025
> URL: https://issues.apache.org/jira/browse/AVRO-1025
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: build
>Reporter: Doug Cutting
>Assignee: Doug Cutting
>Priority: Major
>
> ASF infrastructure has requested that all projects migrate to svnpubsub for 
> their websites and release distributions.





[jira] [Updated] (AVRO-1557) downloads AVRO from the website

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1557:
--
Component/s: build

> downloads AVRO from the website 
> 
>
> Key: AVRO-1557
> URL: https://issues.apache.org/jira/browse/AVRO-1557
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: build
> Environment: web site
>Reporter: evgeny
>Priority: Major
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Hi,
> I think we have a small problem with the main Avro web site.
> Usually, projects put links to the latest build and the current stable 
> version under a static URL, which lets automated tools and people download 
> them easily.
> I believe many administrators would appreciate this.





[jira] [Updated] (AVRO-2030) Fix broken URL to "this book chapter" about Rabin fingerprints in 1.8.1 spec

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2030:
--
Component/s: build

> Fix broken URL to "this book chapter" about Rabin fingerprints in 1.8.1 spec
> 
>
> Key: AVRO-2030
> URL: https://issues.apache.org/jira/browse/AVRO-2030
> Project: Apache Avro
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.8.1
>Reporter: CJ Gaconnet
>Priority: Trivial
>
> The [1.8.1 
> specification|https://avro.apache.org/docs/current/spec.html#N1088B] has a 
> sentence saying:
> bq. Readers interested in the mathematics behind this algorithm may want to 
> read [this book chapter|http://www.scribd.com/fb-6001967/d/84795-Crc].
> The URL http://www.scribd.com/fb-6001967/d/84795-Crc now serves up a 404. 
> Does anyone know what book the link was pointing to?
> Searching around leads me to think it was probably pointing to "14-2 Theory" 
> of _Hacker's Delight_ by Henry S. Warren. If so, it would be nice to either 
> update or remove the hyperlink and have the text cite the book and chapter by 
> name so that interested readers can still find it even if an updated link 
> were to go away.





[jira] [Updated] (AVRO-1059) Apache project branding requirements: DOAP file [PATCH]

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1059:
--
Component/s: build

> Apache project branding requirements: DOAP file [PATCH]
> ---
>
> Key: AVRO-1059
> URL: https://issues.apache.org/jira/browse/AVRO-1059
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: build
>Reporter: Shane Curcuru
>Priority: Major
> Attachments: doap_Avro.rdf
>
>
> Attached.  Re: http://www.apache.org/foundation/marks/pmcs
> See Also: http://projects.apache.org/create.html





[jira] [Updated] (AVRO-1924) Variable named 'date' in IDL

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1924:
--
Component/s: java

> Variable named 'date' in IDL
> 
>
> Key: AVRO-1924
> URL: https://issues.apache.org/jira/browse/AVRO-1924
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.1
>Reporter: Niels Basjes
>Assignee: Ryan Blue
>Priority: Critical
>
> I was compiling Apache Parquet and found that the switch from Avro 1.8.0 to 
> 1.8.1 broke their build.
> The error: {code}
> [ERROR] Failed to execute goal 
> org.apache.avro:avro-maven-plugin:1.8.1:idl-protocol (schemas) ... 
> org.apache.avro.compiler.idl.ParseException: Encountered " "date" "date "" at 
> line 23, column 14.
> [ERROR] Was expecting one of:
> [ERROR]  ...
> [ERROR] "@" ...
> [ERROR] "`" ...
> [ERROR] -> [Help 1]
> {code}
> As it turns out they have a test idl that contains this:
> {code}
> @namespace("org.apache.parquet.avro")
> protocol Cars {
>     record Service {
>         long date;
>     }
> }
> {code}
> The change in AVRO-1684 gave the word 'date' a different meaning for 
> the idl compiler.
> So changing the word 'date' into something else fixes the problem. 
> Yet I think this is an undesirable effect for end user applications.
> [~rdblue]: I assigned this to you because you implemented the mentioned change.





[jira] [Updated] (AVRO-425) Would be very helpful if there was a wireshark "plugin" for decoding the binary wireformat AVRO uses.

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-425:
-
Component/s: misc

> Would be very helpful if there was a wireshark "plugin" for decoding the 
> binary wireformat AVRO uses.
> -
>
> Key: AVRO-425
> URL: https://issues.apache.org/jira/browse/AVRO-425
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: misc
> Environment: Wireshark
>Reporter: Mark Wolfe
>Priority: Major
>  Labels: avro, wireshark
>
> This would be of great assistance to developers and network engineers when 
> debugging issues in production environments using AVRO.
> It would certainly make adoption of this format easier for new developers in 
> the longer term.





[jira] [Updated] (AVRO-2117) Overall cleanup of code

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2117:
--
Component/s: misc

> Overall cleanup of code
> ---
>
> Key: AVRO-2117
> URL: https://issues.apache.org/jira/browse/AVRO-2117
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: misc
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Major
>
> When opening Avro in my IDE I see lots of warnings and notifications that are 
> easy to fix.
> I'm going to pick up several types of those issues (only on master / 1.9.0 !)





[jira] [Updated] (AVRO-1187) Dart codegen + JSON encoding/decoding

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1187:
--
Component/s: misc

> Dart codegen + JSON encoding/decoding
> -
>
> Key: AVRO-1187
> URL: https://issues.apache.org/jira/browse/AVRO-1187
> Project: Apache Avro
>  Issue Type: Wish
>  Components: misc
>Reporter: Quinn Slack
>Priority: Major
>
> It would be nice to have [Dart|http://www.dartlang.org/] codegen and JSON 
> encoding/decoding support.
> There has been some (unfinished) work on protobuf support for Dart: 
> http://code.google.com/p/dart/issues/detail?id=951 
> https://chromiumcodereview.appspot.com/user/Dan%20Rice 
> https://chromiumcodereview.appspot.com/10595002/
> But there are no Dart implementations of Avro, as best I can determine. If 
> anybody is aware of any, or interested in helping create one, please post 
> here.





[jira] [Updated] (AVRO-1791) Please delete old releases from mirroring system

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1791:
--
Component/s: misc

> Please delete old releases from mirroring system
> 
>
> Key: AVRO-1791
> URL: https://issues.apache.org/jira/browse/AVRO-1791
> Project: Apache Avro
>  Issue Type: Bug
>  Components: misc
>Affects Versions: 1.7.7
> Environment: https://dist.apache.org/repos/dist/release/avro/
>Reporter: Sebb
>Priority: Major
>
> To reduce the load on the ASF mirrors, projects are required to delete old 
> releases [1]
> Please can you remove all non-current releases?
> i.e. 1.7.7
> Thanks!
> [1] http://www.apache.org/dev/release.html#when-to-archive





[jira] [Updated] (AVRO-2255) Implement shellcheck automatically on Pull Requests

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2255:
--
Component/s: misc

> Implement shellcheck automatically on Pull Requests
> ---
>
> Key: AVRO-2255
> URL: https://issues.apache.org/jira/browse/AVRO-2255
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: misc
>Reporter: Michael A. Smith
>Priority: Major
>
> In the several PRs to AVRO-2229 I suggested improvements to some of the shell 
> scripts. Many of those improvements fixed bugs caught by 
> [https://github.com/koalaman/shellcheck]. I think we should implement 
> shellcheck in our automatic checks so that contributors get fast feedback on 
> their shell scripts.





[jira] [Updated] (AVRO-1464) Implement Avro serialization in OCaml

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1464:
--
Component/s: misc

> Implement Avro serialization in OCaml
> -
>
> Key: AVRO-1464
> URL: https://issues.apache.org/jira/browse/AVRO-1464
> Project: Apache Avro
>  Issue Type: Wish
>  Components: misc
>Reporter: Jeff Hammerbacher
>Priority: Major
>






[jira] [Updated] (AVRO-1105) Scala API for Avro

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1105:
--
Component/s: misc

> Scala API for Avro
> --
>
> Key: AVRO-1105
> URL: https://issues.apache.org/jira/browse/AVRO-1105
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: misc
>Reporter: Christophe Taton
>Priority: Major
> Attachments: avro-scala.patch
>
>
> Umbrella issue.
> Goal is to provide Scala friendly APIs for Avro records and protocols (RPCs).
> Related project: http://code.google.com/p/avro-scala-compiler-plugin/ looks 
> dead (no change since Sep 2010).





[jira] [Updated] (AVRO-2075) Allow SchemaCompatibility to report possibly lossy conversions

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2075:
--
Component/s: java

> Allow SchemaCompatibility to report possibly lossy conversions
> --
>
> Key: AVRO-2075
> URL: https://issues.apache.org/jira/browse/AVRO-2075
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.7, 1.8.2
> Environment: Java
>Reporter: Anders Sundelin
>Assignee: Anders Sundelin
>Priority: Minor
> Attachments: 
> 0001-AVRO-2075-Add-option-to-report-possible-data-loss-in.patch
>
>
> The Avro spec states that int and long values are promotable to 
> floats and doubles.
> However, promoting an int or a long to a float can lose precision, as can 
> promoting a long to a double.
> It is suggested that the SchemaCompatibility class be updated so that it can 
> flag potentially lossy conversions as errors. The 
> attached patch does just that by adding a new boolean flag (allowDataLoss), 
> preserving backwards compatibility by defaulting this flag to true.
> Test cases illustrating the problem have been added to the unit test class 
> TestReadingWritingDataInEvolvedSchemas.
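
To make the precision loss concrete, here is a small C++ sketch, independent of the attached Java patch, that checks whether a given integer survives promotion to a binary floating-point type. The helper name isExactlyRepresentable is hypothetical; the significand widths (24 bits for float, 53 for double) assume IEEE 754 single and double precision.

```cpp
#include <cassert>
#include <cstdint>

// True if v can be converted to a binary floating-point type with
// `significandBits` bits of precision without losing information
// (24 for an IEEE 754 float, 53 for an IEEE 754 double).
bool isExactlyRepresentable(std::int64_t v, int significandBits) {
    assert(significandBits > 0 && significandBits < 64);
    // Compute the magnitude in unsigned arithmetic so INT64_MIN cannot overflow.
    std::uint64_t mag = v < 0 ? 0 - static_cast<std::uint64_t>(v)
                              : static_cast<std::uint64_t>(v);
    if (mag == 0) return true;
    // Trailing zero bits are absorbed by the exponent, so strip them.
    while ((mag & 1) == 0) mag >>= 1;
    // What remains must fit in the significand.
    return mag < (std::uint64_t{1} << significandBits);
}
```

For example, 16777216 (2^24) is exact as a float but 16777217 is not, and 9007199254740992 (2^53) is exact as a double but 9007199254740993 is not. A compatibility check along the lines proposed in the issue would treat int-to-float, long-to-float and long-to-double as promotions that may hit such values.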





[jira] [Updated] (AVRO-2045) Avro should warn about corrupt EOF files

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2045:
--
Component/s: java

> Avro should warn about corrupt EOF files
> 
>
> Key: AVRO-2045
> URL: https://issues.apache.org/jira/browse/AVRO-2045
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.6
>Reporter: Lars Volker
>Assignee: Nandor Kollar
>Priority: Major
>
> When running queries on truncated files, Impala's Avro scanner issues a 
> warning:
> {noformat}
> WARNINGS: Problem parsing file 
> hdfs://host.company.com:8020/tmp/datagen/some_db/some_table/col1=A/col2=B/col3=D/col4=C/2017-05-18-18-5-9-876-0.avro
>  at 1327214080(EOF)
> Tried to read 64653 bytes but could only read 16549 bytes. This may indicate 
> data file corruption. (file 
> hdfs://host.company.com:8020/tmp/datagen/some_db/some_table/col1=A/col2=B/col3=D/col4=C/2017-05-18-18-5-9-876-0.avro,
>  byte offset: 1327214080)
> {noformat}
> {{avro-tools tojson}} eventually prints the same number of rows that Impala 
> reads, but does not print a warning. Instead it seems to quietly swallow the 
> EOFException.
> I think it should print a warning instead.
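
The behaviour being asked for (report a short read instead of silently stopping) can be sketched in a few lines. This is an illustration in C++ rather than the Java code of avro-tools; the BlockRead struct and readBlock function are hypothetical names, and the warning text simply mirrors the Impala message quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Result of trying to read one block of a known size.
struct BlockRead {
    std::string data;     // the bytes that were actually read
    bool truncated;       // true if the stream ended before `wanted` bytes
    std::string warning;  // empty unless truncated
};

// Read up to `wanted` bytes and record a warning on a short read,
// instead of quietly treating end-of-file as the normal end of data.
BlockRead readBlock(std::istream& in, std::size_t wanted) {
    BlockRead r{std::string(wanted, '\0'), false, ""};
    in.read(&r.data[0], static_cast<std::streamsize>(wanted));
    const std::size_t got = static_cast<std::size_t>(in.gcount());
    assert(got <= wanted);
    r.data.resize(got);
    if (got < wanted) {
        r.truncated = true;
        std::ostringstream msg;
        msg << "Tried to read " << wanted << " bytes but could only read "
            << got << " bytes. This may indicate data file corruption.";
        r.warning = msg.str();
    }
    return r;
}
```

A caller can still return the rows decoded so far, as avro-tools does today, and additionally print `warning` whenever `truncated` is set.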





[jira] [Updated] (AVRO-2075) Allow SchemaCompatibility to report possibly lossy conversions

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2075:
--
Environment: (was: Java)

> Allow SchemaCompatibility to report possibly lossy conversions
> --
>
> Key: AVRO-2075
> URL: https://issues.apache.org/jira/browse/AVRO-2075
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.7, 1.8.2
>Reporter: Anders Sundelin
>Assignee: Anders Sundelin
>Priority: Minor
> Attachments: 
> 0001-AVRO-2075-Add-option-to-report-possible-data-loss-in.patch
>
>





[jira] [Updated] (AVRO-1950) Better Json serialization for Avro decimal logical types?

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1950:
--
Component/s: java

> Better Json serialization for Avro decimal logical types?
> -
>
> Key: AVRO-1950
> URL: https://issues.apache.org/jira/browse/AVRO-1950
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Zoltan Farkas
>Priority: Minor
>
> As I understand it, decimal logical types are currently encoded on top of the 
> bytes and fixed avro types. This makes them a bit "unnatural" in the json 
> encoding...
> I worked around this with a hack in my fork that encodes them naturally as json 
> decimals. A good starting point to look at is: 
> https://github.com/zolyfarkas/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/DecimalEncoder.java
>  
> My approach is a bit hacky, so I would be interested in suggestions for getting 
> this closer to something we can integrate into avro...
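
For context on why the JSON form looks unnatural: per the Avro specification, a decimal stores its unscaled value as a big-endian two's-complement integer in the underlying bytes or fixed value, with the scale kept in the schema. The C++ sketch below shows how such bytes map to a readable decimal string. It is not the code from the fork linked above; the function name decimalBytesToString is hypothetical, and the sketch is deliberately limited to values of at most 8 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Render an Avro-style decimal (big-endian two's-complement unscaled value
// plus a schema-level scale) as a plain decimal string.
std::string decimalBytesToString(const std::vector<std::uint8_t>& bytes, int scale) {
    assert(scale >= 0);
    if (bytes.size() > 8)
        throw std::invalid_argument("this sketch handles at most 8 bytes");
    const bool negative = !bytes.empty() && (bytes[0] & 0x80) != 0;
    // Sign-extend into 64 bits using unsigned arithmetic only.
    std::uint64_t u = negative ? ~std::uint64_t{0} : std::uint64_t{0};
    for (std::size_t i = 0; i < bytes.size(); ++i) u = (u << 8) | bytes[i];
    // Two's-complement negation yields the magnitude of a negative value.
    const std::uint64_t magnitude = negative ? (~u + 1) : u;
    std::string digits = std::to_string(magnitude);
    const std::size_t s = static_cast<std::size_t>(scale);
    if (s > 0) {
        // Left-pad so at least one digit remains before the decimal point.
        if (digits.size() <= s)
            digits = std::string(s + 1 - digits.size(), '0') + digits;
        digits.insert(digits.size() - s, 1, '.');
    }
    return negative ? "-" + digits : digits;
}
```

With scale 2, the bytes 30 39 (unscaled 12345) render as 123.45 and the single byte FF (unscaled -1) renders as -0.01, which is the kind of value a "natural" JSON encoding would carry.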





[jira] [Updated] (AVRO-1827) Handling correctly optional fields when converting Protobuf to Avro

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1827:
--
Component/s: java

> Handling correctly optional fields when converting Protobuf to Avro
> ---
>
> Key: AVRO-1827
> URL: https://issues.apache.org/jira/browse/AVRO-1827
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Jakub Kahovec
>Assignee: Karel Fuka
>Priority: Major
> Attachments: AVRO-1827.patch, AVRO-1827.patch, AVRO-1827.patch, 
> AVRO-1827.patch
>
>
> Hello,
> In the current implementation of converting protobuf files into avro 
> format, protobuf optional fields are given default values in the avro 
> schema if none is specified explicitly. 
> For instance, when the protobuf field is defined as
> {quote}
> optional int64 fieldInt64 = 1;
> {quote}
> it appears in the avro schema as
> {quote}
> "name" : "fieldInt64",
> "type" : "long",
> "default" : 0
> {quote}
> The problem with this implementation is that we lose information about 
> whether the field was present in the original protobuf: when we ask 
> for this field's value in avro, we are given the default value either way. 
> What I'm proposing instead is that if the field in the protobuf is defined as 
> optional and has no default value, the generated avro schema type will use 
> a union of the null type and the matching type, with default value null. 
> It is going to look like this:
> {quote}
> "name" : "fieldInt64",
> "type" : [ "null", "long" ],
> "default" : null
> {quote}
> I'm aware that this is a breaking change, but I think it is the proper way to 
> handle optional fields.
> I've also created a patch which fixes the conversion.
> Jakub 
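
The information loss described above can be shown with a small model. The C++ sketch below is only an analogy, not the Java converter or the attached patch: ProtoMessage stands in for a protobuf message with an optional int64 field, RecordWithDefault for the current "long with default 0" mapping, and RecordWithUnion for the proposed ["null", "long"] mapping. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a protobuf message with `optional int64 fieldInt64 = 1;`.
struct ProtoMessage {
    bool hasFieldInt64 = false;
    std::int64_t fieldInt64 = 0;
};

// Current mapping: "type" : "long", "default" : 0
struct RecordWithDefault {
    std::int64_t fieldInt64 = 0;
};

// Proposed mapping: "type" : [ "null", "long" ], "default" : null
struct NullableLong {
    bool isNull = true;
    std::int64_t value = 0;
};
struct RecordWithUnion {
    NullableLong fieldInt64;
};

// An unset field and an explicit 0 both end up as 0.
RecordWithDefault convertWithDefault(const ProtoMessage& m) {
    RecordWithDefault r;
    if (m.hasFieldInt64) r.fieldInt64 = m.fieldInt64;
    return r;
}

// An unset field stays null, so presence is preserved.
RecordWithUnion convertWithUnion(const ProtoMessage& m) {
    RecordWithUnion r;
    if (m.hasFieldInt64) {
        r.fieldInt64.isNull = false;
        r.fieldInt64.value = m.fieldInt64;
    }
    assert(r.fieldInt64.isNull == !m.hasFieldInt64);
    return r;
}
```

Converting an unset message and a message explicitly set to 0 gives identical results with convertWithDefault, while convertWithUnion keeps them apart through the null branch.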





[jira] [Updated] (AVRO-1726) Add support for appending a variable number of blocks to DataFileWriter

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1726:
--
Component/s: java

> Add support for appending a variable number of blocks to DataFileWriter
> ---
>
> Key: AVRO-1726
> URL: https://issues.apache.org/jira/browse/AVRO-1726
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Bryan Bende
>Priority: Minor
>  Labels: starter
> Fix For: 1.9.0
>
> Attachments: AVRO-1726-2.patch, AVRO-1726.patch
>
>
> It would be helpful to have the ability to append a variable number of raw 
> blocks from a DataFileReader to a DataFileWriter, similar to appendAllFrom() 
> but specifying how many blocks to append.





[jira] [Updated] (AVRO-1643) Add non-String maps as a logical type

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1643:
--
Component/s: java

> Add non-String maps as a logical type
> -
>
> Key: AVRO-1643
> URL: https://issues.apache.org/jira/browse/AVRO-1643
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.6, 1.7.7
>Reporter: Sachin Goyal
>Priority: Minor
>
> Other languages might not be able to duplicate the logic in AVRO-680, so a 
> logical type that indicates a non-string map is indeed a map would be great.
> Reference:
> https://github.com/apache/avro/pull/17





[jira] [Updated] (AVRO-1124) RESTful service for holding schemas

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1124:
--
Component/s: java

> RESTful service for holding schemas
> ---
>
> Key: AVRO-1124
> URL: https://issues.apache.org/jira/browse/AVRO-1124
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Jay Kreps
>Assignee: Jay Kreps
>Priority: Major
> Attachments: AVRO-1124-can-read-with.patch, AVRO-1124-draft.patch, 
> AVRO-1124-validators-preliminary.patch, AVRO-1124.2.patch, AVRO-1124.3.patch, 
> AVRO-1124.4.patch, AVRO-1124.patch, AVRO-1124.patch
>
>
> Motivation: It is nice to be able to pass around data in serialized form but 
> still know the exact schema that was used to serialize it. The overhead of 
> storing the schema with each record is too high unless the individual records 
> are very large. There are workarounds for some common cases: in the case of 
> files a schema can be stored once with a file of many records amortizing the 
> per-record cost, and in the case of RPC the schema can be negotiated ahead of 
> time and used for many requests. For other uses, though, it is nice to be able 
> to pass a reference to a given schema using a small id and allow this to be 
> looked up. Since only a small number of schemas are likely to be active for a 
> given data source, these can easily be cached, so the number of remote 
> lookups is very small (one per active schema version).
> Basically this would consist of two things:
> 1. A simple REST service that stores and retrieves schemas
> 2. Some helper java code for fetching and caching schemas for people using 
> the registry
> We have used something like this at LinkedIn for a few years now, and it 
> would be nice to standardize this facility to be able to build up common 
> tooling around it. This proposal will be based on what we have, but we can 
> change it as ideas come up.
> The facilities this provides are super simple: basically, you can register a 
> schema, which gives back a unique id for it, or you can query for a schema. 
> There is almost no code, and nothing very complex. The contract is that 
> before emitting/storing a record you must first publish its schema to the 
> registry or know that it has already been published (by checking your cache 
> of published schemas). When reading, you check your cache, and if you don't 
> find the id/schema pair there, you query the registry to look it up. I will 
> explain some of the nuances in more detail below. 
> An added benefit of such a repository is that it makes a few other things 
> possible:
> 1. A graphical browser of the various data types that are currently used and 
> all their previous forms.
> 2. Automatic enforcement of compatibility rules. Data is always compatible in 
> the sense that the reader will always deserialize it (since they are using 
> the same schema as the writer) but this does not mean it is compatible with 
> the expectations of the reader. For example if an int field is changed to a 
> string that will almost certainly break anyone relying on that field. This 
> definition of compatibility can differ for different use cases and should 
> likely be pluggable.
> Here is a description of one of our uses of this facility at LinkedIn. We use 
> this to retain a schema with "log" data end-to-end from the producing app to 
> various real-time consumers as well as a set of resulting Avro files in Hadoop. 
> This schema metadata can then be used to auto-create Hive tables (or add new 
> fields to existing tables), or to infer Pig fields, all without manual 
> intervention. One important definition of compatibility that is nice to 
> enforce is compatibility with historical data for a given "table". Log data 
> is usually loaded in an append-only manner, so if someone changes an int 
> field in a particular data set to be a string, tools like pig or hive that 
> expect static columns will be unusable. Even using plain-vanilla map/reduce 
> processing data where columns and types change willy nilly is painful. 
> However the person emitting this kind of data may not know all the details of 
> compatible schema evolution. We use the schema repository to validate that 
> any change made to a schema doesn't violate the compatibility model, and reject 
> the update if it does. We do this check both at run time and as part of 
> the ant task that generates specific record code (as an early warning). 
> Some details to consider:
> Deployment
> This can just be programmed against the servlet API and deployed as a standard 
> WAR. You can run many instances and load-balance traffic over them.
> Persistence
> The storage needs are not very heavy. The clients are expected to cache the 
> id=>schema mapping, and the server can 
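
The read-side contract described in this issue (check the local cache first, query the registry only on a miss) can be sketched roughly as follows. This is a minimal illustration in plain Java, not the code from the attached patches: `SchemaCacheSketch`, `schemaFor`, and the `registryLookup` function are hypothetical names, and both ids and schemas are modelled as plain strings.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical client-side cache for id-to-schema lookups; names are illustrative only. */
final class SchemaCacheSketch {
    private final Map<String, String> schemasById = new ConcurrentHashMap<>();
    // Stand-in for the remote call to the registry (assumed, not a real Avro API).
    private final Function<String, String> registryLookup;

    SchemaCacheSketch(Function<String, String> registryLookup) {
        this.registryLookup = registryLookup;
    }

    /** Returns the cached schema, querying the registry only on the first request per id. */
    String schemaFor(String id) {
        return schemasById.computeIfAbsent(id, registryLookup);
    }

    public static void main(String[] args) {
        SchemaCacheSketch cache = new SchemaCacheSketch(id -> {
            System.out.println("remote lookup for id " + id);
            return "{\"type\": \"string\"}";
        });
        cache.schemaFor("7"); // triggers the remote lookup
        cache.schemaFor("7"); // served from the cache
    }
}
```

Because only a few schema versions are active per data source, a simple unbounded map is probably enough for such a cache; eviction is left out of this sketch.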

[jira] [Updated] (AVRO-2087) Allow specifying default values for logical types in human-readable form

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2087:
--
Component/s: spec

> Allow specifying default values for logical types in human-readable form
> 
>
> Key: AVRO-2087
> URL: https://issues.apache.org/jira/browse/AVRO-2087
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: spec
>Reporter: Zoltan Ivanfi
>Priority: Major
>
> Currently default values for logical types have to be specified as the binary 
> representation of the backing primary type.
> For example, if one wanted to specify 0.00 as the default value for a decimal 
> field, "\u0000" has to be specified as the default value. If the user tries 
> to specify "0.00", like in AVRO-2086, it is silently accepted but results in 
> unexpected behaviour. This value is not parsed and interpreted as a decimal 
> number but is taken to be the byte representation, i.e. the corresponding 
> hexadecimal ASCII byte sequence 30 2E 30 30 = 808333360, which with a scale of 2 
> results in a default decimal value of 8083333.60.
> To set the default value to an arbitrary non-zero value, e.g., 31.80, one has 
> to multiply it by 10^2=100 for a scale of 2, resulting in 3180, which is 
> 0x0C6C when converted to hex. This means that "\u000C\u006C" has to be 
> specified as the default value. Having to do these calculations by hand is 
> not too user (programmer) friendly.
> For a date or timestamp type, the default value has to be specified as a 
> number and not as a string, so an unexpected default value can not be set 
> accidentally in this case. However, one can't use a human-readable 
> representation in this case either; the number of days or seconds 
> (respectively) elapsed since the epoch must be specified, e.g., 1507216329 
> for the current timestamp.
> The first step towards solving this problem will be coming up with a 
> suggested solution. Once we have that, the JIRA description should be 
> extended with details.
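
The hand calculation described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: `DecimalDefaultSketch`, `toDefault`, and `misread` are hypothetical names, not part of Avro, and the sketch assumes the default string carries one escape per byte of the big-endian two's-complement unscaled value.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Avro) for bytes-backed decimal defaults. */
final class DecimalDefaultSketch {

    /** Builds the default string: one backslash-u escape per byte of the unscaled value. */
    static String toDefault(String humanReadable, int scale) {
        // setScale throws ArithmeticException if the value does not fit the scale exactly.
        BigInteger unscaled = new BigDecimal(humanReadable).setScale(scale).unscaledValue();
        StringBuilder sb = new StringBuilder();
        for (byte b : unscaled.toByteArray()) { // big-endian two's complement
            sb.append(String.format("\\u%04X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Interprets the characters of a string as raw bytes, as happens when "0.00" is not parsed. */
    static BigDecimal misread(String text, int scale) {
        byte[] raw = text.getBytes(StandardCharsets.ISO_8859_1);
        return new BigDecimal(new BigInteger(raw), scale);
    }

    public static void main(String[] args) {
        System.out.println(toDefault("31.80", 2)); // escape sequence for unscaled 3180
        System.out.println(toDefault("0.00", 2));  // escape sequence for unscaled 0
        System.out.println(misread("0.00", 2));    // what the unparsed text turns into
    }
}
```

A human-readable default syntax would presumably perform the `toDefault` step inside the schema parser, so that users never need to see the byte form.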





[jira] [Updated] (AVRO-2078) Avro does not enforce schema resolution rules for Decimal type

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2078:
--
Component/s: java

> Avro does not enforce schema resolution rules for Decimal type
> --
>
> Key: AVRO-2078
> URL: https://issues.apache.org/jira/browse/AVRO-2078
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Anthony Hsu
>Assignee: Nandor Kollar
>Priority: Major
> Attachments: dec.avro
>
>
> According to http://avro.apache.org/docs/1.8.2/spec.html#Decimal
> bq. For the purposes of schema resolution, two schemas that are {{decimal}} 
> logical types _match_ if their scales and precisions match.
> This is not enforced.
> I wrote a file with (precision 5, scale 2) and tried to read it with a reader 
> schema with (precision 3, scale 1). I expected an AvroTypeException to be 
> thrown, but none was thrown.
> Test data file attached. The code to read it is:
> {noformat:title=ReadDecimal.java}
> import java.io.File;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileReader;
> import org.apache.avro.generic.GenericDatumReader;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.avro.io.DatumReader;
>
> public class ReadDecimal {
>   public static void main(String[] args) throws Exception {
>     // Reader schema: decimal with precision 3, scale 1
>     Schema schema = new Schema.Parser().parse("{\n"
>         + "  \"type\" : \"record\",\n"
>         + "  \"name\" : \"some_schema\",\n"
>         + "  \"namespace\" : \"com.howdy\",\n"
>         + "  \"fields\" : [ {\n"
>         + "    \"name\" : \"name\",\n"
>         + "    \"type\" : \"string\"\n"
>         + "  }, {\n"
>         + "    \"name\" : \"value\",\n"
>         + "    \"type\" : {\n"
>         + "      \"type\" : \"bytes\",\n"
>         + "      \"logicalType\" : \"decimal\",\n"
>         + "      \"precision\" : 3,\n"
>         + "      \"scale\" : 1\n"
>         + "    }\n"
>         + "  } ]\n"
>         + "}");
>     DatumReader<GenericRecord> datumReader = new GenericDatumReader<>(schema);
>     // dec.avro has precision 5, scale 2
>     DataFileReader<GenericRecord> dataFileReader = new DataFileReader<>(
>         new File("/tmp/dec.avro"), datumReader);
>     GenericRecord foo = null;
>     while (dataFileReader.hasNext()) {
>       // AvroTypeException expected due to change in scale/precision but none occurs
>       foo = dataFileReader.next(foo);
>     }
>   }
> }
> {noformat}
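
The resolution rule quoted from the specification amounts to a simple equality check on precision and scale. The sketch below is self-contained and does not use Avro classes: `DecimalMatchSketch` and `checkMatch` are hypothetical names, and `IllegalStateException` merely stands in for the `AvroTypeException` the reporter expected.

```java
/** Hypothetical check, not Avro code: decimals match only if precision and scale are equal. */
final class DecimalMatchSketch {

    static void checkMatch(int writerPrecision, int writerScale,
                           int readerPrecision, int readerScale) {
        if (writerPrecision != readerPrecision || writerScale != readerScale) {
            throw new IllegalStateException(
                "writer decimal(" + writerPrecision + ", " + writerScale
                    + ") does not match reader decimal("
                    + readerPrecision + ", " + readerScale + ")");
        }
    }

    public static void main(String[] args) {
        checkMatch(5, 2, 5, 2); // identical parameters: passes silently
        try {
            checkMatch(5, 2, 3, 1); // the combination from the report
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```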





[jira] [Updated] (AVRO-2047) NettyTransceiver can NPE when getRemoteName() is called

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2047:
--
Component/s: java

> NettyTransceiver can NPE when getRemoteName() is called
> ---
>
> Key: AVRO-2047
> URL: https://issues.apache.org/jira/browse/AVRO-2047
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Clement Pang
>Priority: Major
>
> NettyTransceiver can NPE if the channel is closed while a request is 
> underway. The correct thing to do seems to be to check for null and throw an 
> IOException ("underlying transport no longer available").
> {code}
> ! java.lang.NullPointerException: null
> ! at org.apache.avro.ipc.NettyTransceiver.getRemoteName(NettyTransceiver.java:431)
> ! at org.apache.avro.ipc.Requestor.writeHandshake(Requestor.java:202)
> ! at org.apache.avro.ipc.Requestor.access$300(Requestor.java:52)
> ! at org.apache.avro.ipc.Requestor$Request.getBytes(Requestor.java:478)
> ! at org.apache.avro.ipc.Requestor.request(Requestor.java:181)
> ! at org.apache.avro.ipc.Requestor.request(Requestor.java:129)
> ! at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:84)
> {code}
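
The fix suggested in the description (check for null and throw an `IOException`) might look roughly like the following. This is a sketch with a hypothetical stand-in class, not the actual `NettyTransceiver` code: the `remote` field, the `close()` method, and the class name are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;

/** Hypothetical stand-in for a transceiver; field and class names are illustrative, not Avro's. */
final class NullSafeTransceiverSketch {
    // Becomes null once the underlying channel has been closed.
    private volatile SocketAddress remote;

    NullSafeTransceiverSketch(SocketAddress remote) {
        this.remote = remote;
    }

    void close() {
        remote = null;
    }

    String getRemoteName() throws IOException {
        SocketAddress current = remote; // read once so a concurrent close() cannot cause an NPE
        if (current == null) {
            throw new IOException("underlying transport no longer available");
        }
        return current.toString();
    }

    public static void main(String[] args) throws IOException {
        NullSafeTransceiverSketch transceiver = new NullSafeTransceiverSketch(
            InetSocketAddress.createUnresolved("example.invalid", 9090));
        System.out.println(transceiver.getRemoteName());
        transceiver.close();
        try {
            transceiver.getRemoteName();
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Copying the field into a local variable before the null check matters here, since the channel may be closed by another thread between the check and the use.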





[jira] [Updated] (AVRO-2099) Decimal precision is ignored

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2099:
--
Component/s: spec

> Decimal precision is ignored
> 
>
> Key: AVRO-2099
> URL: https://issues.apache.org/jira/browse/AVRO-2099
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Reporter: Kornel Kiełczewski
>Priority: Major
>
> According to the documentation 
> https://avro.apache.org/docs/1.8.1/spec.html#Decimal 
> {quote}
> The decimal logical type represents an arbitrary-precision signed decimal 
> number of the form unscaled × 10^-scale.
> {quote}
> Then in the schema we might have an entry like:
> {code}
> {
>   "type": "bytes",
>   "logicalType": "decimal",
>   "precision": 4,
>   "scale": 2
> }
> {code}
> However, in the java deserialization I see that the precision is ignored:
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/Conversions.java#L79
> {code}
> @Override
> public BigDecimal fromBytes(ByteBuffer value, Schema schema, LogicalType type) {
>   int scale = ((LogicalTypes.Decimal) type).getScale();
>   // always copy the bytes out because BigInteger has no offset/length ctor
>   byte[] bytes = new byte[value.remaining()];
>   value.get(bytes);
>   return new BigDecimal(new BigInteger(bytes), scale);
> }
> {code}
> The logical type definition in the java api requires the precision to be set:
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/LogicalTypes.java#L116
> {code}
>   /** Create a Decimal LogicalType with the given precision and scale */
>   public static Decimal decimal(int precision, int scale) {
>     return new Decimal(precision, scale);
>   }
> {code}
> Is this a feature, that we allow arbitrary precision? If so, why do we have 
> the precision in the API and schema, if it's ignored?
> Maybe that's some java specific issue?
> Thanks for any hints.
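
For comparison, a conversion that did enforce the declared precision could look something like the sketch below. It uses only JDK classes and is not Avro's implementation: `CheckedDecimalSketch` is a hypothetical name, and precision and scale are passed as plain ints rather than read from a `LogicalType`.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

/** Illustrative only: a precision-checking variant of the quoted conversion, without Avro types. */
final class CheckedDecimalSketch {

    static BigDecimal fromBytes(ByteBuffer value, int precision, int scale) {
        byte[] bytes = new byte[value.remaining()];
        value.get(bytes);
        BigDecimal decimal = new BigDecimal(new BigInteger(bytes), scale);
        // BigDecimal.precision() is the number of digits in the unscaled value.
        if (decimal.precision() > precision) {
            throw new IllegalArgumentException(
                "value " + decimal + " has precision " + decimal.precision()
                    + ", but the schema allows at most " + precision);
        }
        return decimal;
    }

    public static void main(String[] args) {
        ByteBuffer fits = ByteBuffer.wrap(BigInteger.valueOf(1234).toByteArray());
        System.out.println(fromBytes(fits, 4, 2));
        ByteBuffer tooWide = ByteBuffer.wrap(BigInteger.valueOf(123456).toByteArray());
        try {
            fromBytes(tooWide, 4, 2);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```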





[jira] [Updated] (AVRO-1763) Avro Schema Generator to handle polymorphic types

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1763:
--
Component/s: spec

> Avro Schema Generator to handle polymorphic types
> -
>
> Key: AVRO-1763
> URL: https://issues.apache.org/jira/browse/AVRO-1763
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Affects Versions: 1.9.0
>Reporter: Qiangqiang Shi
>Priority: Major
>
> Inheritance and polymorphism are widely used in Java libraries. If a more 
> sophisticated Avro schema generator can be added to Avro, users can easily 
> generate Avro schemas for classes in complex contexts, third-party code, and 
> legacy code.
> For example, for the following class:
> {code:java}
> import java.util.List;
> import java.util.Map;
>
> public class TestReflectPolymorphismData {
>   public static class SuperclassA1 {
>     private String SuperclassA1;
>   }
>   public static class SubclassA1 extends SuperclassA1 {
>     private String SubclassA1;
>   }
>   public static class SubclassA2 extends SuperclassA1 {
>     private String SubclassA2;
>   }
>   public static class SuperB1 {
>     private SubclassA1 SubclassA1;
>     // Type arguments restored from the field names and the schema below.
>     private List<SubclassA2> SubclassA2List;
>     private Map<String, SuperclassA1> stringSuperclassA1Map;
>     private Map<Integer, SuperclassA1> integerSuperclassA1Map;
>   }
> }
> {code}
> It'll be good if Avro can provide a schema generator to generate a schema 
> like the following automatically for class SuperB1 :
> {code:java}
> {
>   "type": "record",
>   "name": "SuperB1",
>   "namespace": "org.apache.avro.reflect.TestReflectPolymorphismData$",
>   "fields": [
>     {
>       "name": "SubclassA1",
>       "type": {
>         "type": "record",
>         "name": "SuperclassA1",
>         "fields": [
>           {
>             "name": "SuperclassA1",
>             "type": "string"
>           },
>           {
>             "name": "SuperclassA1Subclasses",
>             "type": [
>               "null",
>               {
>                 "type": "record",
>                 "name": "SubclassA1",
>                 "fields": [
>                   {
>                     "name": "SubclassA1",
>                     "type": "string"
>                   }
>                 ]
>               },
>               {
>                 "type": "record",
>                 "name": "SubclassA2",
>                 "fields": [
>                   {
>                     "name": "SubclassA2",
>                     "type": "string"
>                   }
>                 ]
>               }
>             ]
>           }
>         ]
>       }
>     },
>     {
>       "name": "SubclassA2List",
>       "type": {
>         "type": "array",
>         "items": "SuperclassA1",
>         "java-class": "java.util.List"
>       }
>     },
>     {
>       "name": "stringSuperclassA1Map",
>       "type": {
>         "type": "map",
>         "values": "SuperclassA1"
>       }
>     },
>     {
>       "name": "integerSuperclassA1Map",
>       "type": {
>         "type": "array",
>         "items": {
>           "type": "record",
>           "name": "Pair34255fab6d3d79ff",
>           "namespace": "org.apache.avro.reflect",
>           "fields": [
>             {
>               "name": "key",
>               "type": "int"
>             },
>             {
>               "name": "value",
>               "type": "org.apache.avro.reflect.TestReflectPolymorphismData$.SuperclassA1"
>             }
>           ]
>         },
>         "java-class": "java.util.Map"
>       }
>     }
>   ]
> }
> {code}
> related story: AVRO-1568





[jira] [Updated] (AVRO-1752) Aliases for enum symbols.

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1752:
--
Component/s: spec

> Aliases for enum symbols.
> -
>
> Key: AVRO-1752
> URL: https://issues.apache.org/jira/browse/AVRO-1752
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Reporter: Zoltan Farkas
>Priority: Minor
>
> Currently named types and fields might have aliases according to the spec.
> It would be great if enum symbols could have aliases as well...
> This would be useful to compatibly fix misspellings...





[jira] [Updated] (AVRO-1934) Avro test resources reference old avro dev versions

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1934:
--
Component/s: java

> Avro test resources reference old avro dev versions
> ---
>
> Key: AVRO-1934
> URL: https://issues.apache.org/jira/browse/AVRO-1934
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.1
>Reporter: Zoltan Farkas
>Priority: Minor
>
> For example:
> https://github.com/apache/avro/blob/master/lang/java/maven-plugin/src/test/resources/unit/idl/pom.xml
>  
> references 1.7.3-SNAPSHOT:
> {code}
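>   <parent>
>     <artifactId>avro-parent</artifactId>
>     <groupId>org.apache.avro</groupId>
>     <version>1.7.3-SNAPSHOT</version>
>     <relativePath>../../../../../../../../../</relativePath>
>   </parent>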
> {code}
> this does not seem right.





[jira] [Updated] (AVRO-1911) for avro HTTP content type instead of avro/binary, application/octet-stream;fmt=avro might be more appropriate?

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1911:
--
Component/s: java

> for avro HTTP content type instead of avro/binary, 
> application/octet-stream;fmt=avro might be more appropriate?
> ---
>
> Key: AVRO-1911
> URL: https://issues.apache.org/jira/browse/AVRO-1911
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Zoltan Farkas
>Priority: Major
>
> the content type is defined in:
> {code}
> /** An HTTP-based {@link Transceiver} implementation. */
> public class HttpTransceiver extends Transceiver {
>   static final String CONTENT_TYPE = "avro/binary";
> {code}
> I suggest using for avro binary:
> application/octet-stream;fmt=avro
> and for avro json:
> application/json;fmt=avro
> this would take advantage of standard mime types...





[jira] [Updated] (AVRO-1850) Align JSON and binary record serialization

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1850:
--
Component/s: spec

> Align JSON and binary record serialization
> --
>
> Key: AVRO-1850
> URL: https://issues.apache.org/jira/browse/AVRO-1850
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Reporter: David Kay
>Priority: Major
>  Labels: Encoding, Record
>
> The documentation describes the encoding of Avro records as:
> bq.Binary: A record is encoded by encoding the values of its fields in the 
> order that they are declared. In other words, a record is encoded as just the 
> concatenation of the encodings of its fields. Field values are encoded per 
> their schema.
> bq.JSON: Except for unions, the JSON encoding is the same as is used to 
> encode field default values.
> The _field default values_ table says that records and maps are both encoded 
> as JSON type _object_.
> *Enhancement:*
> There is currently no way to write an Avro schema describing a JSON array of 
> positional parameters (i.e. an array containing variables of possibly 
> different type).  An Avro record is the datatype representing an ordered 
> collection of values.  For consistency with the binary encoding, and to allow 
> Avro to represent a schema for JSON tuples, encoding should say:
> bq.JSON: Except for unions and records, the JSON encoding is the same as is 
> used to encode field default values.  A record is encoded as an array by 
> encoding the values of its fields in the order that they are declared.
> For the example schema:
> {noformat}
> {"namespace": "example.avro",
>  "type": "record",
>  "name": "User",
>  "fields": [
>  {"name": "name", "type": "string"},
>  {"name": "favorite_number",  "type": ["int", "null"]},
>  {"name": "favorite_color", "type": ["string", "null"]}
>  ]
> }
> {noformat}
> the JSON encoding currently converts an Avro record to an Avro map (JSON 
> object):
> {noformat}
> {   "name": "Joe",
>  "favorite_number": 42,
>   "favorite_color": null  }
> {noformat}
> Instead Avro records should be encoded in JSON in the same manner as they are 
> encoded in binary, as a JSON array containing the fields in the order they 
> are defined:
> {noformat}
> ["Joe", 42, null]
> {noformat}
> The set of JSON texts validated by the example Avro schema and by the 
> corresponding JSON schema should be equal:
> {noformat}
> {
>   "$schema": "http://json-schema.org/draft-04/schema#",
>   "type": "array",
>   "name": "User",
>   "items": [
> {"name":"name", "type": "string"},
> {"name":"favorite_number", "type":["integer","null"]},
> {"name":"favorite_color", "type":["string","null"]}
>   ]
> }
> {noformat}
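
The difference between the current object form and the proposed positional form can be illustrated with a small self-contained encoder. This is only a sketch: `PositionalJsonSketch` and its methods are hypothetical names, records are modelled as an insertion-ordered map rather than Avro types, and, like the simplified example in the issue, it ignores Avro's special JSON encoding for unions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical mini-encoder contrasting the object form with the proposed array form. */
final class PositionalJsonSketch {

    /** Proposed form: a JSON array of field values in declaration order. */
    static String encodeAsArray(Map<String, Object> record) {
        StringJoiner out = new StringJoiner(", ", "[", "]");
        for (Object value : record.values()) {
            out.add(toJson(value));
        }
        return out.toString();
    }

    /** Current form: a JSON object keyed by field name. */
    static String encodeAsObject(Map<String, Object> record) {
        StringJoiner out = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, Object> e : record.entrySet()) {
            out.add(quote(e.getKey()) + ": " + toJson(e.getValue()));
        }
        return out.toString();
    }

    private static String toJson(Object value) {
        if (value == null) return "null";
        if (value instanceof Number || value instanceof Boolean) return value.toString();
        return quote(value.toString());
    }

    // Escapes only backslashes and double quotes; control characters are not handled.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>(); // keeps declaration order
        user.put("name", "Joe");
        user.put("favorite_number", 42);
        user.put("favorite_color", null);
        System.out.println(encodeAsObject(user));
        System.out.println(encodeAsArray(user));
    }
}
```

The positional form relies entirely on field order, so the map type used to model a record must preserve insertion order, as `LinkedHashMap` does here.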





[jira] [Updated] (AVRO-1768) stdin support for getschema

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1768:
--
Component/s: java

> stdin support for getschema
> ---
>
> Key: AVRO-1768
> URL: https://issues.apache.org/jira/browse/AVRO-1768
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Bennie Schut
>Priority: Minor
>
> It would be nice to support reading from stdin on getschema calls so you 
> don't need a local file first.
> Somewhat similar to AVRO-1583.





[jira] [Updated] (AVRO-1717) [Github] Support for optional fields when converting json to avro

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1717:
--
Component/s: java

> [Github] Support for optional fields when converting json to avro
> -
>
> Key: AVRO-1717
> URL: https://issues.apache.org/jira/browse/AVRO-1717
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Bartosz Wojtkiewicz
>Priority: Major
>
> Currently there is an issue when we want to convert a JSON object to Avro using 
> a schema that allows optional fields (fields whose union type includes 'null'). 
> When the JSON object does not explicitly contain such a field with a 'null' 
> value, it is treated as not conforming to the schema. I added a few test cases 
> that illustrate this problem.
> PR: https://github.com/apache/avro/pull/47
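
One way to picture the requested behaviour is a pre-processing step that fills in missing nullable fields before decoding. The sketch below is a plain-Java illustration and not the approach taken in the pull request: `OptionalFieldSketch`, `withExplicitNulls`, and the idea of passing the nullable field names as a set are all assumptions.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: adds an explicit null for every nullable field the input omits. */
final class OptionalFieldSketch {

    static Map<String, Object> withExplicitNulls(Map<String, Object> json,
                                                 Set<String> nullableFields) {
        Map<String, Object> out = new LinkedHashMap<>(json); // the input is left untouched
        for (String field : nullableFields) {
            if (!out.containsKey(field)) {
                out.put(field, null);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("name", "Joe");
        Map<String, Object> filled =
            withExplicitNulls(input, Collections.singleton("favorite_color"));
        System.out.println(filled);
    }
}
```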





[jira] [Updated] (AVRO-1714) Nullable Named Schema definition in IDL fails.

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1714:
--
Component/s: java

> Nullable Named Schema definition in IDL fails.
> --
>
> Key: AVRO-1714
> URL: https://issues.apache.org/jira/browse/AVRO-1714
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.6
>Reporter: Mark Perris
>Priority: Major
>
> According to Section 7.2 of the Avro IDL, named schemata may be treated as 
> primitive types.
> As such, I believe it should be possible to create a nullable schema 
> reference:
> {code}
>record coordinate {
>   string type;
>}
>record tweet {
>   union {null, coordinate} coordinate;
>}
> {code}
> to accommodate
> {code}
> {  
>"coordinate":{  
>   "type":"point"
>}
> }
> {code} and {code}
> {  
>"coordinate":null
> }
> {code}
> however, any attempt to store data against that schema results in
> {code}
> Exception in thread "main" org.apache.avro.AvroTypeException: Unknown union branch type
>   at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:445)
>   at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
>   at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>   at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:155)
>   at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
>   at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>   at org.apache.avro.tool.DataFileWriteTool.run(DataFileWriteTool.java:99)
>   at org.apache.avro.tool.Main.run(Main.java:85)
>   at org.apache.avro.tool.Main.main(Main.java:74)
> {code}





[jira] [Updated] (AVRO-1665) Provide way to represent BYTES type using base64 encoding in JSON

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1665:
--
Component/s: spec

> Provide way to represent BYTES type using base64 encoding in JSON
> -
>
> Key: AVRO-1665
> URL: https://issues.apache.org/jira/browse/AVRO-1665
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Affects Versions: 1.7.7
>Reporter: Konstantin Shaposhnikov
>Priority: Major
>
> Currently JsonEncoder and JsonDecoder represent BYTES type as String encoded 
> using ISO-8859-1 charset.
> It would be good to provide an option to use base64 encoding (e.g. using Jackson's 
> JsonGenerator.writeBinary(byte[] data, int offset, int len) method).
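
The two textual representations can be compared using only JDK classes. This sketch does not touch `JsonEncoder` or Jackson: `BytesJsonSketch` and its method names are hypothetical, and it simply shows the ISO-8859-1 mapping next to `java.util.Base64`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

/** Hypothetical comparison of the two string forms for a bytes value. */
final class BytesJsonSketch {

    /** Current style: each byte becomes the character with the same code point (0 to 255). */
    static String asLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    /** Proposed alternative: standard base64 text. */
    static String asBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] data = {0x0C, 0x6C, (byte) 0xE9};
        String encoded = asBase64(data);
        System.out.println(encoded);
        System.out.println(Arrays.equals(data, fromBase64(encoded)));
        System.out.println(asLatin1(data).length());
    }
}
```

Base64 output is plain ASCII, which avoids control characters in the JSON text at the cost of roughly a third more characters.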





[jira] [Updated] (AVRO-1631) Support for field long names

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1631:
--
Component/s: spec

> Support for field long names
> 
>
> Key: AVRO-1631
> URL: https://issues.apache.org/jira/browse/AVRO-1631
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Reporter: Nikoleta Verbeck
>Priority: Minor
>
> It would be beneficial to have a way to define additional aliases for 
> referencing a field, other than just its name value. 
> The use case for this would be when you have a defined spec for communicating 
> between two services, and within this spec fields use short names like bId. 
> But within code you would like to reference that field in a longer, more 
> descriptive form. Example: setBidderId/getBidderId vs setBId/getBId.
> Aliases somewhat solve this but only from a one sided approach (Read or 
> Write) not a bidirectional (Read and Write). The only way to make aliases 
> work in a bidirectional way would be to define two records of the same field 
> set but with the field name and alias values swapped. Basically creating 1 
> record for reading data and the other for writing data.
> One option to improve this would be to expose all field aliases as getters 
> and setters. Another would be to add another attribute to the field def such 
> as 'as' or 'knownAs'. 
> Example of option two:
> {code:title=Option2.avsc}
> {
> "namespace":"options",
> "type":"record",
> "name":"Bidder",
> "fields":[
> {"name":"bId", "as":"bidderId", "type":"string"}
> ]
> }
> {code}





[jira] [Updated] (AVRO-2148) Avro: Schema compatibility/evolution - attribute size change

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2148:
--
Component/s: spec

> Avro: Schema compatibility/evolution - attribute size change
> 
>
> Key: AVRO-2148
> URL: https://issues.apache.org/jira/browse/AVRO-2148
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Affects Versions: 1.8.2
>Reporter: Ashok
>Priority: Critical
>  Labels: patch
>
> Let's assume the schema changed from V1 to V2.  Currently, we can't create a 
> merged schema that is compatible with both.  Depending on which size we keep 
> for the specific attribute in the reader schema (the merged one), you can read 
> Avro files of schema V1 or V2, but not both.  If we keep the higher attribute 
> size value (64) as part of the merged schema, it should allow reading Avro 
> files with the lower attribute size value (16).
>  
>  * *V1 schema:*
> { "name": "sid", "type": [ "null",
> { "type": "fixed", "name": "SID", "namespace": "com.int.datatype", "doc": "", 
> "size": *64* }
> ], "doc": "", "default": null, "businessLogic": "" }
>  
>  * *V2 schema:*
> { "name": "sid", "type": [ "null",
> { "type": "fixed", "name": "SID", "namespace": "com.int.datatype", "doc": "", 
> "size": *16* }
> ], "doc": "", "default": null, "businessLogic": "" }





[jira] [Updated] (AVRO-1612) typo in documentation for "fixed" type

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1612:
--
Component/s: spec

> typo in documentation for "fixed" type
> --
>
> Key: AVRO-1612
> URL: https://issues.apache.org/jira/browse/AVRO-1612
> Project: Apache Avro
>  Issue Type: Bug
>  Components: spec
>Reporter: Peter Amstutz
>Priority: Minor
>
> There appears to be a cut-and-paste error in the documentation for the 
> "Fixed" type.  The "namespace" and "aliases" fields probably shouldn't be 
> there.
> Text of the current online documentation  (1.7.7):
> https://avro.apache.org/docs/current/spec.html#Fixed
> Fixed
> Fixed uses the type name "fixed" and supports two attributes:
> name: a string naming this fixed (required).
> namespace, a string that qualifies the name;
> aliases: a JSON array of strings, providing alternate names for this enum 
> (optional).
> size: an integer, specifying the number of bytes per value (required).
> For example, 16-byte quantity may be declared with:
> {"type": "fixed", "size": 16, "name": "md5"}





[jira] [Updated] (AVRO-1571) Support parameterized types in Avro

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1571:
--
Component/s: java

> Support parameterized types in Avro
> ---
>
> Key: AVRO-1571
> URL: https://issues.apache.org/jira/browse/AVRO-1571
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.6, 1.7.7, 1.8.1
>Reporter: Sachin Goyal
>Priority: Major
> Attachments: ParameterizedTypesTest.java
>
>
> The below code cannot be serialized by Avro.
> {code}
> class Leaf<P, Q> {
>   P p;
>   Q q;
> }
> class Root {
>   Middle1 m1;
>   Middle2 m2;
>   Middle3 m3;
> }
> class Middle1 {
>   Leaf  foo;
> }
> class Middle2 {
>   Leaf  foo;
> }
> class Middle3  {
>   Leaf  foo;
> }
> {code}
> This is because only the current class is consulted when generating the 
> schema.
> The parent class's context, where the actual type information is present, is 
> not available in the ReflectData#createSchema() functions.
> Please see the attached test too for a simpler case.
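
A minimal sketch of where the missing type information lives, using only JDK reflection and no Avro calls. The field declared in the enclosing class carries the actual type arguments, while the generic class on its own only exposes its type variables. `Holder` and the `String`/`Integer` arguments are hypothetical, since the type arguments in the ticket's snippet did not survive in this archive.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.util.ArrayList;
import java.util.List;

class ParentContextSketch {
    static class Leaf<P, Q> { P p; Q q; }

    // Hypothetical enclosing class; the type arguments are assumptions.
    static class Holder { Leaf<String, Integer> foo; }

    /** Type arguments visible from the enclosing class's field declaration. */
    static List<String> fieldTypeArguments(Class<?> owner, String fieldName) {
        List<String> names = new ArrayList<>();
        try {
            Type t = owner.getDeclaredField(fieldName).getGenericType();
            if (t instanceof ParameterizedType) {
                for (Type arg : ((ParameterizedType) t).getActualTypeArguments()) {
                    names.add(arg.getTypeName());
                }
            }
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
        return names;
    }

    /** What the generic class alone knows: its unresolved type variables. */
    static List<String> ownTypeParameters(Class<?> c) {
        List<String> names = new ArrayList<>();
        for (TypeVariable<?> v : c.getTypeParameters()) {
            names.add(v.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(fieldTypeArguments(Holder.class, "foo"));
        System.out.println(ownTypeParameters(Leaf.class));
    }
}
```

A schema generator that only inspects `Leaf.class` sees `P` and `Q`; it would need the field's generic type from the parent to learn what they stand for.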





[jira] [Updated] (AVRO-2150) Improved idl syntax support for "marker properties"

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2150:
--
Component/s: spec

> Improved idl syntax support for "marker properties"
> ---
>
> Key: AVRO-2150
> URL: https://issues.apache.org/jira/browse/AVRO-2150
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Reporter: Zoltan Farkas
>Priority: Minor
>
> It would be nice to allow in IDL "marker properties" like:
> {code}
> @MarkerProperty
> record TestRecord {
> 
> }
> {code}
> This would simply be shorthand for:
> {code}
> @MarkerProperty("")
> record TestRecord {
> 
> }
> {code}





[jira] [Updated] (AVRO-1122) Java: Avro RPC Requestor can block during handshake in async mode

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1122:
--
Component/s: java

> Java: Avro RPC Requestor can block during handshake in async mode
> -
>
> Key: AVRO-1122
> URL: https://issues.apache.org/jira/browse/AVRO-1122
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.6.3
>Reporter: Mike Percy
>Priority: Major
> Attachments: Screen Shot 2012-06-27 at 12.43.32 AM.png
>
>
> We are seeing an issue in Flume where the Avro RPC Requestor is blocking for 
> long periods of time waiting for the Avro handshake to complete. Since we are 
> using the API with Futures, this should not block.





[jira] [Updated] (AVRO-1562) Add support for types extending Maps/Collections

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1562:
--
Component/s: java

> Add support for types extending Maps/Collections
> 
>
> Key: AVRO-1562
> URL: https://issues.apache.org/jira/browse/AVRO-1562
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.6
>Reporter: Sachin Goyal
>Priority: Major
> Attachments: custom_map_and_collections1.patch
>
>
> Consider the following code:
> {code}
> import java.io.ByteArrayOutputStream;
> import java.util.*;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
> public class AvroDerivingMaps
> {
> public static void main (String [] args) throws Exception
> {
> MapDerivedContainer orig = new MapDerivedContainer();
> ReflectData rdata = ReflectData.AllowNull.get();
> Schema schema = rdata.getSchema(MapDerivedContainer.class);
> System.out.println(schema);
> 
> ReflectDatumWriter<MapDerivedContainer> datumWriter =
> new ReflectDatumWriter<MapDerivedContainer>(MapDerivedContainer.class, rdata);
> DataFileWriter<MapDerivedContainer> fileWriter =
> new DataFileWriter<MapDerivedContainer>(datumWriter);
> ByteArrayOutputStream baos = new ByteArrayOutputStream();
> fileWriter.create(schema, baos);
> fileWriter.append(orig);
> fileWriter.close();
> }
> }
> class MapDerived extends HashMap
> {
> Integer a = 1;
> String b = "b";
> }
> class MapDerivedContainer
> {
> MapDerived2 map = new MapDerived2();
> }
> class MapDerived2 extends MapDerived
> {
> String c = "c";
> }
> {code}
> \\
> \\
> It prints the following schema and then throws an exception:
> {code:javascript}
> {"type":"record","name":"MapDerivedContainer","namespace":"avro","fields":[{"name":"map","type":["null",{"type":"record","name":"MapDerived2","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}],"default":null}]}
> {code}
> {color:brown}
> Exception in thread "main" org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException:
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"MapDerived2","namespace":"avro","fields":[{"name":"c","type":["null","string"],"default":null},{"name":"a","type":["null","int"],"default":null},{"name":"b","type":["null","string"],"default":null}]}]: {}
>   at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:600)
>   at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:151)
>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
>   at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114)
>   at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:203)
>   at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:145)
>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>   at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:290)
>   ... 1 more
> {color}
> \\
> \\
> It appears that ReflectData#createSchema() checks for "type instanceof 
> ParameterizedType" and, because of this, skips the map handling for a class 
> that merely extends a Map.
> GenericData#isMap() makes no such check, and GenericData#resolveUnion() 
> fails as a result.
> The same may be true for classes extending ArrayList, Collection, Set, etc.
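
The mismatch described above can be reproduced with plain JDK reflection. The following is only a sketch under assumptions: it makes no Avro calls, `Container` is a hypothetical stand-in, and the `HashMap<String, String>` type arguments are a guess because the ticket's were lost in this archive. It shows that a field declared as `Map<K, V>` has a `ParameterizedType`, a field declared as a Map subclass does not, yet the runtime value is still a `Map`.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashMap;
import java.util.Map;

class MapSubclassSketch {
    // Type arguments are an assumption, not taken from the ticket.
    static class MapDerived extends HashMap<String, String> {
        Integer a = 1;
    }

    // Hypothetical container with one field of each kind.
    static class Container {
        Map<String, String> plain = new HashMap<>();
        MapDerived derived = new MapDerived();
    }

    /** True when the declared field type is a ParameterizedType such as Map<K, V>. */
    static boolean declaredAsParameterized(String fieldName) {
        try {
            Type t = Container.class.getDeclaredField(fieldName).getGenericType();
            return t instanceof ParameterizedType;
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** True when the runtime value is a Map, whatever its declared type. */
    static boolean isMapAtRuntime(Object value) {
        return value instanceof Map;
    }

    public static void main(String[] args) {
        Container c = new Container();
        System.out.println(declaredAsParameterized("plain"));   // declared as Map<String, String>
        System.out.println(declaredAsParameterized("derived")); // declared as a plain class
        System.out.println(isMapAtRuntime(c.derived));          // runtime check on the value
    }
}
```

A schema built from the declared type and a writer that dispatches on the runtime value can therefore disagree about whether `derived` is a record or a map.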
> Also, note the schema for the class extending Map:
> {code:javascript}
> {  
>"type":"record",
>"name":"MapDerived2",
>"fields":[  
>   {  
>  "name":"c",
>  "type":[  
> "null",
> "string"
>  ],
>  "default":null
>   },
>   {  
>  "name":"a",
>  "type":[  
> "null",
> "int"
>  ],
>  "default":null
>   },
>   {  
>  "name":"b",
>  "type":[  
> "null",
> "string"
>  ],
>  "default":null
>   }
>]
> }
> {code}
> This schema ignores the Map completely.
> Probably, for such a class, the schema should look like:
> {code:javascript}
> {
>"type":"record",
>"name":"MapDerived2",
>"fields":[  
>   {  
>   

[jira] [Updated] (AVRO-1570) ReflectData.AllowNull fails with polymorphism and @Union annotation

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1570:
--
Component/s: java

> ReflectData.AllowNull fails with polymorphism and @Union annotation
> ---
>
> Key: AVRO-1570
> URL: https://issues.apache.org/jira/browse/AVRO-1570
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.6
>Reporter: Sachin Goyal
>Priority: Major
>
> A nested-union exception is thrown if the following structure is serialized 
> with ReflectData.AllowNull.
> (Plain ReflectData works fine.)
> {code}
> @Union({Derived.class})
> class Base 
> {
>Integer a = 5;
> }
> class Derived extends Base
> {
> String b = "Foo";
> }
> class PolymorphicDO
> {
>Base obj = new Derived();
> }
> // Serialization code:
> ReflectData rdata = ReflectData.AllowNull.get();
> Schema schema = rdata.getSchema(PolymorphicDO.class);
> ReflectDatumWriter<PolymorphicDO> datumWriter =
> new ReflectDatumWriter<PolymorphicDO>(PolymorphicDO.class, rdata);
> DataFileWriter<PolymorphicDO> fileWriter = new DataFileWriter<PolymorphicDO>(datumWriter);
> fileWriter.create(schema, new ByteArrayOutputStream());
> fileWriter.append(new PolymorphicDO());
> {code}





[jira] [Updated] (AVRO-1371) Support of data encryption for Avro file

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1371:
--
Component/s: spec

> Support of data encryption for Avro file
> 
>
> Key: AVRO-1371
> URL: https://issues.apache.org/jira/browse/AVRO-1371
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Affects Versions: 1.8.0
>Reporter: Haifeng Chen
>Priority: Major
>  Labels: Rhino
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The Avro file format is widely used in Hadoop. As data security is getting 
> more and more attention in the Hadoop community, we propose to improve the 
> Avro file format to be able to handle data encryption and decryption.
> Similar to compression and decompression, encryption and decryption can be 
> implemented with Codecs, a concept that already exists in Avro. However, Avro 
> Codec context handling needs to be extended to support per-codec contexts, 
> such as encryption keys, for encryption and decryption.
> Avro supports multiple language implementations. This is an umbrella JIRA for 
> this work and the implementation work for each language will be addressed in 
> sub tasks.
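
To make the idea of a codec that carries a per-codec context concrete, here is a rough sketch using only the JDK's `javax.crypto`. It does not extend or call any Avro class: `EncryptingCodecSketch`, `encode`, `decode`, and `newKey` are hypothetical names, and AES-GCM with a 12-byte IV stored in front of each block is an assumption, not something the proposal specifies.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class EncryptingCodecSketch {
    private static final int IV_BYTES = 12;   // assumed IV size for AES-GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    // The "context" a plain compression codec does not need: a secret key.
    private final SecretKey key;
    private final SecureRandom random = new SecureRandom();

    EncryptingCodecSketch(SecretKey key) {
        this.key = key;
    }

    /** Generates a fresh 128-bit AES key for demonstration purposes. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts one block; the output is IV followed by ciphertext and tag. */
    ByteBuffer encode(ByteBuffer block) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] plain = new byte[block.remaining()];
            block.get(plain);
            byte[] sealed = cipher.doFinal(plain);
            ByteBuffer out = ByteBuffer.allocate(IV_BYTES + sealed.length);
            out.put(iv);
            out.put(sealed);
            out.flip();
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encode(): reads the IV, then decrypts and authenticates the rest. */
    ByteBuffer decode(ByteBuffer block) {
        try {
            byte[] iv = new byte[IV_BYTES];
            block.get(iv);
            byte[] sealed = new byte[block.remaining()];
            block.get(sealed);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            return ByteBuffer.wrap(cipher.doFinal(sealed));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point of the sketch is the constructor argument: unlike a compression codec, this one cannot be created from a name alone, which is why the proposal asks for per-codec context handling.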





[jira] [Updated] (AVRO-1429) Exception on storing Null value through AvroStorage using PIG

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1429:
--
Component/s: java

> Exception on storing Null value through AvroStorage using PIG
> -
>
> Key: AVRO-1429
> URL: https://issues.apache.org/jira/browse/AVRO-1429
> Project: Apache Avro
>  Issue Type: Task
>  Components: java
> Environment: Hadoop 0.20.2-cdh3u5
> Apache Pig version 0.8.1-cdh3u5
> java version "1.6.0_27"
>Reporter: Sudhir Ranjan
>Priority: Major
>  Labels: features, patch
>
> I get an exception when storing a null-valued record/tuple as Avro.
> The input file has one column of long values (one of them is null, i.e. 
> empty), and when I try to store the data in Avro format, it throws an 
> error.
> Please let me know if I am missing anything in the code below, or else 
> please provide a patch.
> **My code:
> REGISTER 
> /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/snappy-java-1.0.4.1.jar
> REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/avro-1.7.5.jar
> REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/json-simple-1.1.jar;
> REGISTER /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/piggybank.jar;
> REGISTER 
> /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/jackson-core-asl-1.5.5.jar;
> REGISTER 
> /home/hadoop/work/sudhir/AvroAnalysis/Avrojars/jackson-mapper-asl-1.5.5.jar;
> -- The input file has only 1 column (plain text data, i.e. TSV format), and 
> one of its values is null, i.e. empty
> A = load '/home/hadoop/work/sudhir/AvroAnalysis/input/TSV_uncompressed/part*' 
> using PigStorage('\t') as (USER_ID:long);
> -- The output to be stored in Avro data format
> STORE A INTO 
> '/home/hadoop/work/sudhir/AvroAnalysis/output/TSV_uncompressed/part*' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('schema','{"namespace":"com.sudhir.schema.users.avro","type":"long","name":"users_avro","doc":"Avro
>  storing with schema using 
> Pig.","fields":[{"name":"USER_ID","type":["null","long"],"default":null}]}');
> ***Getting Error like:
> INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - 100% complete
> ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate 
> exception from backed error: 
> org.apache.avro.file.DataFileWriter$AppendWriteException: 
> java.lang.NullPointerException: null of long
> ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!





[jira] [Updated] (AVRO-1354) SortedKeyValueFiles should support appends

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1354:
--
Component/s: java

> SortedKeyValueFiles should support appends
> --
>
> Key: AVRO-1354
> URL: https://issues.apache.org/jira/browse/AVRO-1354
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Dipti Desai
>Priority: Major
>
> SortedKeyValueFiles currently don't allow for appends. This functionality 
> would be nice to have.
> http://apache-avro.679487.n3.nabble.com/JIRA-to-support-append-for-SortedKeyValueFiles-td4027834.html





[jira] [Updated] (AVRO-2250) Release 1.9.0

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2250:
--
Component/s: release

> Release 1.9.0
> -
>
> Key: AVRO-2250
> URL: https://issues.apache.org/jira/browse/AVRO-2250
> Project: Apache Avro
>  Issue Type: Task
>  Components: release
>Reporter: Nandor Kollar
>Priority: Major
>






[jira] [Updated] (AVRO-1226) Non-Avro data causes runtime exceptions/errors when sent to Avro NettyTransceiver

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1226:
--
Component/s: java

> Non-Avro data causes runtime exceptions/errors when sent to Avro 
> NettyTransceiver
> -
>
> Key: AVRO-1226
> URL: https://issues.apache.org/jira/browse/AVRO-1226
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.3
>Reporter: Brock Noland
>Priority: Major
>
> AVRO- put in a stopgap measure to stop Avro from throwing an OOMError 
> when something like an HTTP request was sent to an AVRO IPC port. The general 
> issue of port scanning/monitoring causing Avro to throw opaque runtime errors 
> still exists.
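
One plausible way such errors arise, sketched under a stated assumption: if a receiver treats the first four bytes of incoming data as a big-endian length or count prefix, the ASCII bytes of an HTTP request decode to a very large number. The framing of NettyTransceiver is not shown in this ticket, so `declaredLength`, `looksLikeValidFrame`, and the 16 MiB limit below are hypothetical illustrations, not Avro code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class FrameLengthGuardSketch {
    // Hypothetical upper bound on an acceptable frame size.
    static final int MAX_FRAME_BYTES = 16 * 1024 * 1024;

    /** Interprets the first four bytes as a big-endian int, as a length prefix would be. */
    static int declaredLength(byte[] firstBytes) {
        return ByteBuffer.wrap(firstBytes).getInt();
    }

    /** Rejects negative or implausibly large lengths before any buffer is allocated. */
    static boolean looksLikeValidFrame(byte[] firstBytes) {
        int length = declaredLength(firstBytes);
        return length >= 0 && length <= MAX_FRAME_BYTES;
    }

    public static void main(String[] args) {
        byte[] http = "GET / HTTP/1.1".getBytes(StandardCharsets.US_ASCII);
        // "GET " read as an int is 0x47455420, i.e. more than a gigabyte.
        System.out.println(declaredLength(http));
        System.out.println(looksLikeValidFrame(http));
        System.out.println(looksLikeValidFrame(new byte[] {0, 0, 0, 16}));
    }
}
```

A size check of this kind avoids the allocation, but the caller still needs a clear, typed error for "this is not Avro data", which is the general issue the ticket describes.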





[jira] [Updated] (AVRO-1156) Avro responder swallows thrown Errors

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1156:
--
Component/s: java

> Avro responder swallows thrown Errors
> -
>
> Key: AVRO-1156
> URL: https://issues.apache.org/jira/browse/AVRO-1156
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Mike Percy
>Priority: Major
> Attachments: AVRO-1156-1.patch
>
>
> The Avro responder wraps caught Errors, such as OutOfMemoryErrors, in 
> Exceptions and rethrows them. That's problematic because an Error should be 
> allowed to crash the JVM, since it's often irrecoverable.
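
The behavior the ticket asks for can be sketched in plain Java as follows. This is not the Avro responder code or the attached patch; `respond` and `outcome` are hypothetical names, and the choice of `IllegalStateException` as the wrapper is an assumption made for the example.

```java
import java.util.function.Supplier;

class ErrorPropagationSketch {
    /** Wraps ordinary failures, but lets any java.lang.Error propagate untouched. */
    static <T> T respond(Supplier<T> handler) {
        try {
            return handler.get();
        } catch (Throwable t) {
            if (t instanceof Error) {
                throw (Error) t; // e.g. OutOfMemoryError: do not wrap or swallow
            }
            throw new IllegalStateException("handler failed", t);
        }
    }

    /** Reports the result, or the class name of whatever escaped respond(). */
    static String outcome(Supplier<Object> handler) {
        try {
            return String.valueOf(respond(handler));
        } catch (Throwable t) {
            return t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(() -> "ok"));
        System.out.println(outcome(() -> { throw new IllegalArgumentException("bad input"); }));
        System.out.println(outcome(() -> { throw new InternalError("simulated"); }));
    }
}
```

The `catch (Throwable t)` in `outcome` exists only so the demonstration can print what escaped; production code would let the Error reach the top of the thread.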





[jira] [Updated] (AVRO-1032) Add AvroMapDriver and AvroReduceDriver

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1032:
--
Component/s: java

> Add AvroMapDriver and AvroReduceDriver
> --
>
> Key: AVRO-1032
> URL: https://issues.apache.org/jira/browse/AVRO-1032
> Project: Apache Avro
>  Issue Type: Wish
>  Components: java
>Reporter: Daniel Micol-Ponce
>Priority: Major
>
> I think Avro should include an AvroMapDriver and AvroReduceDriver, similar to 
> Hadoop's MapDriver and ReduceDriver, in order to allow easier unit tests.





[jira] [Updated] (AVRO-628) Unions have no getter function (avro_union_get)

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-628:
-
Component/s: c

> Unions have no getter function (avro_union_get)
> ---
>
> Key: AVRO-628
> URL: https://issues.apache.org/jira/browse/AVRO-628
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: c
>Reporter: Gavin M. Roy
>Priority: Major
>
> The union data type has no getter function and as such, the only way to get 
> to the data in the union is to create a struct
> {code}typedef struct avro_union_datum_t {
>   struct avro_obj_t obj;
>   int64_t discriminant;
>   avro_datum_t value;
> } avro_union_datum_t;{code}
> in your own header or source file, because the struct is defined in datum.h, 
> which is not installed with the library.  In addition, there is a type warning 
> when trying to use avro_record_get with the avro_union_datum_t.
> Ideally there would be a getter that exposes the datum in the union.





[jira] [Updated] (AVRO-1009) Use ExecutionHandler by default in NettyServer and/or clarify documentation

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1009:
--
Component/s: java

> Use ExecutionHandler by default in NettyServer and/or clarify documentation
> ---
>
> Key: AVRO-1009
> URL: https://issues.apache.org/jira/browse/AVRO-1009
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.6.2
>Reporter: James Baldassari
>Priority: Major
>  Labels: java
>
> It may be a good idea to use an ExecutionHandler with a cached thread pool by 
> default in NettyServer.  If an ExecutionHandler is not used then, as pointed 
> out in AVRO-976 and AVRO-1001, each Netty session can only execute one RPC at 
> a time.  Users should still be allowed to override the ExecutionHandler with 
> their own implementation.  Whether we make this change or not, I think the 
> documentation in NettyServer should explain in a little more detail the 
> behavior of NettyServer with and without an ExecutionHandler.
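
As a JDK-only analogy for the difference described above (this is not Netty's ExecutionHandler and makes no Netty or Avro calls), the sketch below submits two tasks that each wait briefly for the other to start. With a single worker thread they run one at a time and the first task times out; with a cached thread pool they overlap. `tasksOverlap` and the 500 ms timeout are hypothetical choices for the illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ExecutorConcurrencySketch {
    /** Submits two tasks that wait for each other; true only if both ran concurrently. */
    static boolean tasksOverlap(ExecutorService pool) {
        CountDownLatch bothStarted = new CountDownLatch(2);
        Callable<Boolean> rpc = () -> {
            bothStarted.countDown();
            // Succeeds only if the other task starts while this one is still running.
            return bothStarted.await(500, TimeUnit.MILLISECONDS);
        };
        try {
            Future<Boolean> first = pool.submit(rpc);
            Future<Boolean> second = pool.submit(rpc);
            boolean firstSawSecond = first.get();
            boolean secondSawFirst = second.get();
            return firstSawSecond && secondSawFirst;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One worker: requests are handled strictly one after another.
        System.out.println(tasksOverlap(Executors.newSingleThreadExecutor()));
        // Cached pool: a second request does not wait for the first to finish.
        System.out.println(tasksOverlap(Executors.newCachedThreadPool()));
    }
}
```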





[jira] [Updated] (AVRO-2159) Naming Limitations of Schemas in Stricter Reference Contexts

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2159:
--
Component/s: spec

> Naming Limitations of Schemas in Stricter Reference Contexts
> 
>
> Key: AVRO-2159
> URL: https://issues.apache.org/jira/browse/AVRO-2159
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: spec
>Reporter: Bridger Howell
>Priority: Major
>
> (Excuse the lengthiness of this ticket description - it was initially written 
> as an email that became too long. Feel free to correct any misguided 
> reasoning.)
> I've come to realize that there are some undesirable constraints on how avro 
> schemas can be used in Java code generation and IDL, that only appear as 
> minor annoyances when you use schemas generically. In particular, I'm focused 
> on cases where it's desirable to use two schemas that have the same name in 
> some context.
>   
>  *Issue:*
>  Suppose I'm writing an application that publishes many different kinds of 
> data somewhere, with each type of data having its own schema. And then 
> suppose that some number of those schemas would like to share some kind of 
> common schema, to start with.
>   
>  If I do this, and I happen to be using Java code generation to manage 
> schemas, I'll soon find difficulty in two directions:
>   
>  - I would find it difficult to upgrade the data shared among all of these 
> external schemas by way of the common schema, without upgrading all of those 
> schemas at the same time. The problem here being that neither Java's 
> classpath nor an IDL protocol can support the way avro's name field maps as a 
> class name onto the classpath or a reference name onto a protocol's symbols.
>   
>  The intermediate step of the application being partially migrated between 
> version 1 and version 2 of a common schema has no representation in either of 
> these contexts. Using a different name becomes a very annoying option in many 
> cases, since it is an incompatible change (or with aliases, it's at least not 
> consistently compatible across implementations).
>  - I would find it difficult to migrate away from the external schemas using 
> that shared schema, for the same reasons listed above.
> In IDL (without code generation), these issues can usually be avoided by 
> creating a second protocol, and in generic avro, the issues would be avoided 
> by using a different schema parser or schema builder.
>   
>  *Analysis:*
>  At first glance, it is tempting to blame the name-matching requirement for 
> schema resolution as a culprit - and it may be correct in many cases that 
> requiring schemas have compatible structure is all that is needed.
>   
>  However, the way I see it is that the name-matching requirement for schema 
> resolution is there to ensure that there is _the intent for two schemas to 
> resolve with each other_, and the rest of the checks are just there to make 
> sure that such an intent can be reasonably carried out.
>   
>  The difficulty in either of the two examples above arises not because of a 
> lack of pre-determined intent for schemas to resolve, but rather because of 
> the inability to simultaneously supply a unique reference for each of the 
> schemas, while intending that the correct groups of schemas can resolve.
>   
>  Thus, the way to avoid these issues so far has been to create a new 
> reference context, and the severity of the issue in each case corresponds to 
> the difficulty of creating a new reference context:
>  * For generic schemas, create a new parser or schema builder [easy - minorly 
> annoying]
>  * For IDL, create a new protocol [minorly annoying - somewhat annoying]
>  * For Java code generation, create a new classpath [very annoying (Java 9) - 
> impossible]
> Based on that, I understand a schema's name as expressing two overlapping 
> meanings:
>  - the intent to be able to resolve with other schemas with the same name 
> (let's call this the {{resolveName}})
>  - the ability to be uniquely referenced from some context (let's call this 
> the {{referenceName}})
>  
>  If these two meanings were able to be specified independently, I think that 
> schemas would be much easier to use in contexts where references are more 
> limited.
>   
>  *Speculative Solutions:*
>  Minimally, I think it's reasonable to create at least one new field to 
> separate the meaning of a schema's {{referenceName}} from its 
> {{resolveName}}, and use the old name field to compatibly handle missing 
> values. Then other tools that don't immediately apply schema resolution, can 
> optionally upgrade to support using the {{referenceName}} instead of the 
> {{resolveName}}.
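
A small sketch of the fallback rule proposed above, in plain Java. Nothing here is an existing Avro API: `SchemaNamesSketch` and its fields are hypothetical, and the example names (`com.example.Common` and the `v1`/`v2` reference names) are invented for illustration. A missing `resolveName` or `referenceName` falls back to the old `name`, so existing schemas keep their current meaning.

```java
class SchemaNamesSketch {
    final String name;           // the existing "name" field
    final String resolveName;    // hypothetical new field, may be null
    final String referenceName;  // hypothetical new field, may be null

    SchemaNamesSketch(String name, String resolveName, String referenceName) {
        this.name = name;
        this.resolveName = resolveName;
        this.referenceName = referenceName;
    }

    /** Name used to decide whether two schemas are intended to resolve. */
    String effectiveResolveName() {
        return resolveName != null ? resolveName : name;
    }

    /** Name used to refer to the schema from a classpath or protocol. */
    String effectiveReferenceName() {
        return referenceName != null ? referenceName : name;
    }

    /** Two schemas are intended to resolve when their resolve names match. */
    static boolean intendedToResolve(SchemaNamesSketch a, SchemaNamesSketch b) {
        return a.effectiveResolveName().equals(b.effectiveResolveName());
    }

    public static void main(String[] args) {
        SchemaNamesSketch v1 =
            new SchemaNamesSketch("com.example.Common", null, "com.example.v1.Common");
        SchemaNamesSketch v2 =
            new SchemaNamesSketch("com.example.Common", null, "com.example.v2.Common");
        // Same resolve name, distinct reference names: both versions can coexist.
        System.out.println(intendedToResolve(v1, v2));
        System.out.println(v1.effectiveReferenceName() + " / " + v2.effectiveReferenceName());
    }
}
```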
>   
>  Beyond that, having {{name}} continue to mean {{resolveName}} would mean 
> that old avro 

[jira] [Updated] (AVRO-2187) Add RPC Streaming constructs/keywords to Avro IDL or schema

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2187:
--
Component/s: spec

> Add RPC Streaming constructs/keywords to Avro IDL or schema
> ---
>
> Key: AVRO-2187
> URL: https://issues.apache.org/jira/browse/AVRO-2187
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: spec
>Reporter: Srujan Narkedamalli
>Priority: Major
>
> Motivation:
> We recently added support for transporting Avro serialization and IDL over 
> gRPC for Java. In order to use the streaming features of gRPC, or of any other 
> transport that supports streaming, we need to be able to specify them in the 
> IDL and schema.
> Details:
> Currently, gRPC supports 3 types of streaming calls:
>  # server streaming (the server can send multiple responses for a single 
> request)
>  # client streaming (the client can send multiple requests and the server 
> sends a single response)
>  # bi-directional streaming (an ongoing RPC with multiple requests and 
> responses)
> We would want a way to represent these types of calls in Avro's IDL, similar 
> to one-way calls, using keywords. Usually in gRPC with other IDLs, a 
> streaming request or response is a repeated payload of the same type. For 
> client streaming and bi-directional streaming, it would be simpler to have a 
> single request argument when representing their type in callbacks.





[jira] [Updated] (AVRO-2254) Unions with 2 records declared downward fail

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2254:
--
Component/s: java

> Unions with 2 records declared downward fail
> 
>
> Key: AVRO-2254
> URL: https://issues.apache.org/jira/browse/AVRO-2254
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.9.0
>Reporter: Zoltan Farkas
>Priority: Major
>
> The following IDL fails, complaining that the same type is declared twice in 
> the union:
> {code}
> @namespace("org.apache.avro.gen")
> protocol UnionFwd {
> record TestRecord {
>   union {SR1, SR2} unionField;
> }
> record SR1 {
>   string field;
> }
> record SR2 {
>   string field;
> }
> }
> {code}
> The fix for this can be pretty simple:
> https://github.com/zolyfarkas/avro/commit/56b215f73f34cc80d505875c90217916b271abb5





[jira] [Updated] (AVRO-2205) Add IP address logical type and convertors

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2205:
--
Component/s: spec

> Add IP address logical type and convertors
> --
>
> Key: AVRO-2205
> URL: https://issues.apache.org/jira/browse/AVRO-2205
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: spec
>Reporter: Tristan Stevens
>Priority: Major
>
> IP addresses can be represented much more compactly as a 64-bit integer, 
> which is more efficient for storage and lets consumers do equality or subnet 
> (range) comparisons using long-integer arithmetic.
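
A minimal sketch of the arithmetic involved, covering IPv4 only: an IPv4 address is 32 bits and fits in a Java `long`, whereas an IPv6 address is 128 bits and would not. The class and method names are hypothetical and this is not a proposed Avro conversion API.

```java
class Ipv4AsLongSketch {
    /** Packs a dotted-quad IPv4 address into the low 32 bits of a long. */
    static long toLong(String dottedQuad) {
        String[] parts = dottedQuad.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four octets: " + dottedQuad);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Subnet membership as a mask-and-compare on the packed values. */
    static boolean inSubnet(String address, String network, int prefixBits) {
        long mask = (0xFFFFFFFFL << (32 - prefixBits)) & 0xFFFFFFFFL;
        return (toLong(address) & mask) == (toLong(network) & mask);
    }

    public static void main(String[] args) {
        System.out.println(toLong("192.168.1.10"));
        System.out.println(inSubnet("192.168.1.10", "192.168.1.0", 24));
        System.out.println(inSubnet("192.168.2.10", "192.168.1.0", 24));
    }
}
```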





[jira] [Updated] (AVRO-2273) Release 1.8.3

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2273:
--
Component/s: release

> Release 1.8.3
> -
>
> Key: AVRO-2273
> URL: https://issues.apache.org/jira/browse/AVRO-2273
> Project: Apache Avro
>  Issue Type: Task
>  Components: release
>Reporter: Thiruvalluvan M. G.
>Priority: Major
> Fix For: 1.8.3
>
>
> This ticket is for releasing Avro 1.8.3 and discussing any topics related to 
> it.





[jira] [Updated] (AVRO-2093) Extend "custom coders" to fully support union types

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2093:
--
Component/s: java

> Extend "custom coders" to fully support union types
> ---
>
> Key: AVRO-2093
> URL: https://issues.apache.org/jira/browse/AVRO-2093
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Raymie Stata
>Priority: Major
>
> The initial implementation of "custom coders" for SpecificRecord (AVRO-2090) 
> only supports "nullable unions" (two-branch unions where one branch is the 
> null type).  This JIRA extends that implementation to support all forms of 
> unions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AVRO-1222) unable to install avro 1.7.3, avro-c.pc missing

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1222:
--
Component/s: c

> unable to install avro 1.7.3, avro-c.pc missing
> ---
>
> Key: AVRO-1222
> URL: https://issues.apache.org/jira/browse/AVRO-1222
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c
>Affects Versions: 1.7.3
> Environment: racky# uname -a
> Linux racky 2.6.32.14-127.nuMetra.1.fc12.x86_64 #1 SMP Sat Jun 19 07:08:40 
> PDT 2010 x86_64 x86_64 x86_64 GNU/Linux
>Reporter: Alfonso Urdaneta
>Priority: Major
>
> ./cmake_avrolib.sh in 
> http://mirror.nexcess.net/apache/avro/stable/c/avro-c-1.7.3.tar.gz fails 
> during the installation phase.
> Install the project...
> -- Install configuration: "Debug"
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/consumer.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/data.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/legacy.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/msstdint.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/refcount.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/io.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/allocation.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/platform.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/resolver.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/value.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/basics.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/errors.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/msinttypes.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/schema.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/include/avro/generic.h
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/lib/libavro.a
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/lib/libavro.so.22.0.0
> -- Installing: /home/alfonso/tmp/avro-c-1.7.3/build/avrolib/lib/libavro.so
> CMake Error at src/cmake_install.cmake:65 (FILE):
>   file INSTALL cannot find file
>   "/home/alfonso/tmp/avro-c-1.7.3/build/src/avro-c.pc" to install.
> Call Stack (most recent call first):
>   cmake_install.cmake:37 (INCLUDE)
> make: *** [install] Error 1
> 6.695u 3.161s 0:09.83 100.2%  0+0k 0+65560io 0pf+0w





[jira] [Updated] (AVRO-2146) getting Expected start-union. Got VALUE_STRING

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2146:
--
Component/s: java

> getting Expected start-union. Got VALUE_STRING
> --
>
> Key: AVRO-2146
> URL: https://issues.apache.org/jira/browse/AVRO-2146
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
> Environment: error message:
> Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
>  at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:698)
>  at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
>  at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
>  at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>  at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
>  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
>  at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
>  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
>  at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
>  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
>  at myJson2Avro.fromJasonToAvro(myJson2Avro.java:81)
>  at myJson2Avro.main(myJson2Avro.java:48)
>Reporter: laki
>Priority: Major
>
> Here is the schema, no unions, but getting union error :
>  
> {
>  "type" : "record",
>  "name" : "edm_generic_publisher_avro_schema",
>  "namespace" : "edm.avro",
>  "doc" : "The generic avro schema used by publishers to publish events to the 
> enterprise streaming service",
>  "fields" : [
> {"name" : "event", 
>  "type" : {
>  "type" : "record",
>  "name" : "event_meta_data",
>  "fields" : [
> {"name" : "event_name", "type" : "string", "doc" : "The name of the event. In 
> the CDC, this field is populated with the name of the data base table or 
> segment."}
> ,
> {"name" : "operation_type", "type" : "string", "doc": "The operation or 
> action that triggered the event. e.g., Insert, Update, Delete, etc."}
> ,
> {"name" : "transaction_identifier", "type" : "string", "default" : "NONE", 
> "doc" : "A unique identifier that identifies a unit or work or transaction. 
> Useful in relating multiple events together."}
> ,
> {"name" : "event_publication_timestamp_millis", "type" : "string", "doc": 
> "timestamp when the event was published"}
> ,
>  
> {"name" : "event_publisher", "type" : "string", "doc" : "The system or 
> application that published the event"}
> ,
>  
> {"name" : "event_publisher_identity", "type": "string", "default" : "NONE", 
> "doc": "The identity (user) of the system or application that published the 
> event"}
> ,
>  
> {"name" : "event_timestamp_millis", "type" : "string", "default" : "NONE", 
> "doc": "timestamp when the event occured"}
> ,
>  
> {"name": "event_initiator", "type": "string", "default" : "NONE", "doc" : 
> "The system or application that initiated the event"}
> ,
> {"name": "event_initiator_identity", "type" : "string", "default" : "NONE", 
> "doc": "The system id or application id that initiated the event" }
> ]},
>  "doc" : "The data about the published event"
>  },
>  { "name" : "contents",
>  "type" : {
>  "name": "data_field_groups",
>  "type": "array",
>  "items": {
>  "type": "record",
>  "name": "data_field_group",
>  "fields" : [
> {"name": "data_group_name", "type": "string" }
> ,
>  {
>  "name": "data_fields",
>  "type": {
>  "type": "array",
>  "items": {
>  "name": "data_field",
>  "type": "record",
>  "fields":[
> {"name" : "data_field_name", "type" : "string", "doc" : "The field name"}
> ,
>  
> {"name": "data_field_type", "type": "string", "doc" : "The data type is one 
> of the following values: string, boolean, int, long, float, double or bytes"}
> ,
> {"name" : "data_field_value", "type" : ["string"], "doc" : "The value"}
> ]
>  }
> }
> }
>  ]
>  }
>  },
>  "doc" : "The datafields for the for the published event"
>  } 
>  ]
>  }
> ;
>  
>  
> here is the code that is causing the issue---
>  
> static byte[] fromJasonToAvro( 

[jira] [Updated] (AVRO-1568) Allow Java polymorphism in Avro for third-party code

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1568:
--
Component/s: java

> Allow Java polymorphism in Avro for third-party code
> 
>
> Key: AVRO-1568
> URL: https://issues.apache.org/jira/browse/AVRO-1568
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.6
>Reporter: Sachin Goyal
>Priority: Major
>
> A large number of Java designs interacting with databases with 
> Hibernate/Couchbase (perhaps, even otherwise) have Java polymorphism of the 
> form:
> {code:java}
> class Base 
> {
>Integer a = 5;
> }
> class Derived extends Base
> {
> String b = "Foo";
> }
> class PolymorphicDO
> {
>Base b = new Derived();
> }
> {code}
> Jackson handles this kind of field by using annotations such as:
> {code}
> @JsonTypeInfo(use = JsonTypeInfo.Id.CLASS, include = 
> JsonTypeInfo.As.PROPERTY, property = "@class")
> {code}
> If such a thing can be added to Avro, all those Java designs could become 
> immediately usable with Avro. They would also become Hadoop compatible due to 
> AvroSerde.





[jira] [Updated] (AVRO-2094) Extend "custom coders" to support logical types

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2094:
--
Component/s: java

> Extend "custom coders" to support logical types
> ---
>
> Key: AVRO-2094
> URL: https://issues.apache.org/jira/browse/AVRO-2094
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Raymie Stata
>Priority: Major
>
> The initial implementation of "custom coders" (AVRO-2090) does not support 
> Avro's logical types.  This JIRA extends that implementation to remove this 
> limitation.





[jira] [Updated] (AVRO-2116) unknown fields in json not ignored

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2116:
--
Component/s: java

> unknown fields in json not ignored
> --
>
> Key: AVRO-2116
> URL: https://issues.apache.org/jira/browse/AVRO-2116
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.1
> Environment: java 1.8
>Reporter: redlion
>Priority: Major
>
> As the screenshot shows, I added two unknown fields: 'unknown1' at the 
> root level and 'unknown2' inside the sub-record startRule. When I try to 
> parse this JSON, I get an error that says: 
> Expected Unkown fileds: [unkown2], Got FIELD_NAME.
> !http://images-1254198035.file.myqcloud.com/avro_issue.png|height=350,width=550!





[jira] [Updated] (AVRO-2260) IDL Json Parsing is lossy, and it could be made more accurate.

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2260:
--
Component/s: java

> IDL Json Parsing is lossy, and it could be made more accurate.
> --
>
> Key: AVRO-2260
> URL: https://issues.apache.org/jira/browse/AVRO-2260
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Reporter: Zoltan Farkas
>Priority: Minor
>
> Currently all integers are handled as Long and all floating-point numbers as 
> Double, which causes the following issues:
> 1) Numbers larger than MAXLONG cannot be handled.
> 2) Unnecessary precision is introduced.
> {code}
> JsonNode Json() :
> { String s; Token t; JsonNode n; }
> { 
> ( s = JsonString() { n = new TextNode(s); }
> | (t= { n = new LongNode(Long.parseLong(t.image)); })
> | (t= {n=new 
> DoubleNode(Double.parseDouble(t.image));})
> | n=JsonObject()
> | n=JsonArray()
> | ( "true" { n = BooleanNode.TRUE; } )
> | ( "false" { n = BooleanNode.FALSE; } )
> | ( "null" { n = NullNode.instance; } )
>  )
>   { return n; }
> }
> {code}
> This should be improved to:
> {code}
> JsonNode Json() :
> { String s; Token t; JsonNode n; }
> { 
> ( s = JsonString() { n = new TextNode(s); }
> | (t= {
>try {
>  n = new IntNode(Integer.parseInt(t.image));
>} catch(NumberFormatException  e) {
>  try {
> n = new LongNode(Long.parseLong(t.image));
>  } catch(NumberFormatException  ex2) {
> n = new BigIntegerNode(new java.math.BigInteger(t.image));
>  }
>}
>  })
> | (t= {n=new DecimalNode(new 
> java.math.BigDecimal(t.image));})
> | n=JsonObject()
> | n=JsonArray()
> | ( "true" { n = BooleanNode.TRUE; } )
> | ( "false" { n = BooleanNode.FALSE; } )
> | ( "null" { n = NullNode.instance; } )
>  )
>   { return n; }
> }
> {code}
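
The narrowing cascade proposed above can be sketched in plain Java without the Jackson node classes. This is only an illustration under assumptions: `JsonNumberSketch`, `parseInteger`, and `parseDecimal` are hypothetical names, and the boxed `Integer`, `Long`, `BigInteger`, and `BigDecimal` results stand in for `IntNode`, `LongNode`, `BigIntegerNode`, and `DecimalNode`.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of Avro: mirrors the int -> long -> BigInteger
// fallback from the proposed grammar action using only the JDK.
class JsonNumberSketch {

    // Returns the narrowest of Integer, Long, BigInteger that can hold the literal.
    static Number parseInteger(String image) {
        try {
            return Integer.valueOf(image);
        } catch (NumberFormatException e) {
            try {
                return Long.valueOf(image);
            } catch (NumberFormatException e2) {
                // Still throws NumberFormatException if the text is not an integer at all.
                return new BigInteger(image);
            }
        }
    }

    // Keeps the decimal digits exactly as written instead of rounding to a double.
    static BigDecimal parseDecimal(String image) {
        return new BigDecimal(image);
    }

    public static void main(String[] args) {
        System.out.println(parseInteger("42").getClass().getSimpleName());
        System.out.println(parseInteger("9223372036854775808").getClass().getSimpleName());
        System.out.println(parseDecimal("0.10"));
    }
}
```

The exception-driven fallback matches the shape of the proposal; a range check on a single `BigInteger` parse would avoid the nested try blocks, at the cost of always allocating the big value.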





[jira] [Updated] (AVRO-1273) JavaScript dynamic generation of constructor funcs for Avro records

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1273:
--
Component/s: javascript

> JavaScript dynamic generation of constructor funcs for Avro records
> ---
>
> Key: AVRO-1273
> URL: https://issues.apache.org/jira/browse/AVRO-1273
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: javascript
>Reporter: Quinn Slack
>Priority: Minor
>  Labels: javascript
> Attachments: AVRO-1273.patch
>
>
> Per https://issues.apache.org/jira/browse/AVRO-485, I have extended Avro's 
> JavaScript support to dynamically generate constructors for Avro records.
> Validation of JS objects against Avro schemas is still supported, but the API 
> is different: Avro.validate(schema, obj) instead of 
> Validator.validate(schema, obj). This is a breaking change but may be worth 
> it because there are now several Avro.* funcs.
> Code is at https://github.com/sqs/avro/tree/lang-js/lang/js. I will attach a 
> diff.
> Here is sample usage. We compile a ManyFieldsRecord constructor function 
> using the Avro schema as input. The constructor function accepts a JS object, 
> which it validates against the Avro schema and then uses to populate the new 
> object's fields. ManyFieldsRecord objects then use Object.defineProperty 
> setters to ensure that the object remains valid Avro.
> {code:javascript}
> var manyFieldsRecordSchema = {
>   type: 'record', name: 'ManyFieldsRecord', fields: [
> {name: 'nullField', type: 'null'},
> {name: 'booleanField', type: 'boolean'},
> {name: 'intField', type: 'int'},
> {name: 'longField', type: 'long'},
> {name: 'floatField', type: 'float'},
> {name: 'doubleField', type: 'double'},
> {name: 'stringField', type: 'string'},
> {name: 'bytesField', type: 'bytes'}
>   ]
> };
> var compiledTypes = Avro.compile(manyFieldsRecordSchema),
>   ManyFieldsRecord = compiledTypes.ManyFieldsRecord,
>   mfr = new ManyFieldsRecord();
> test.throws(function() { mfr.nullField = undefined; });
> test.throws(function() { mfr.nullField = 1; });
> test.throws(function() { mfr.booleanField = 'a'; });
> test.throws(function() { mfr.intField = 'a'; }); // TODO: warn if setting int/long field to a non-integer
> test.throws(function() { mfr.longField = 'a'; });
> test.throws(function() { mfr.floatField = 'a'; });
> test.throws(function() { mfr.doubleField = 'a'; });
> test.throws(function() { mfr.stringField = 3; });
> mfr.nullField = null;
> mfr.booleanField = true;
> mfr.intField = 1;
> mfr.longField = 2;
> mfr.floatField = 3.5;
> mfr.doubleField = 4.5;
> mfr.stringField = 'a';
> test.equal(mfr.nullField, null);
> test.equal(mfr.booleanField, true);
> test.equal(mfr.intField, 1);
> test.equal(mfr.longField, 2);
> test.equal(mfr.floatField, 3.5);
> test.equal(mfr.doubleField, 4.5);
> test.equal(mfr.stringField, 'a');
> // Standard JavaScript JSON API interface:
> mfr.toJSON(); // --> returns plain JS object (without Avro-validating setters)
> JSON.stringify(mfr); // --> returns Avro JSON
> {code}
> More examples are in the test dir: 
> https://github.com/sqs/avro/tree/lang-js/lang/js/test.
> This is still rough and I am very interested in getting feedback. Thanks!





[jira] [Updated] (AVRO-1518) Python client support decimal.Decimal types -> double encoding / decoding

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1518:
--
Component/s: python

> Python client support decimal.Decimal types -> double encoding / decoding
> -
>
> Key: AVRO-1518
> URL: https://issues.apache.org/jira/browse/AVRO-1518
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: python
>Reporter: Scott Reynolds
>Assignee: Scott Reynolds
>Priority: Major
> Fix For: 1.7.9
>
>
> The Python standard library (2.4 and later) provides a Decimal type that has 
> much better semantics than the standard binary float. The Avro library should 
> be able to accept Decimals and encode them as doubles.
> (https://docs.python.org/2/library/decimal.html)
> I also believe it should, by default, turn Avro doubles into Decimal objects 
> instead of floats.
> This simple patch allows encoding a Decimal into an Avro double:
> {code}
> --- io.py 2014-05-23 13:41:14.0 -0700
> +++ /Users/sreynolds/Projects/avro-1.7.6 2/src/avro/io.py 2014-05-23 
> 13:44:03.0 -0700
> @@ -46,6 +46,11 @@ try:
>  except ImportError:
>   import simplejson as json
> +try:
> +  from decimal import Decimal
> +except ImportError:
> +  Decimal = float
> +
>  #
>  # Constants
>  #
> @@ -117,7 +122,7 @@ def validate(expected_schema, datum):
>  and LONG_MIN_VALUE <= datum <= LONG_MAX_VALUE)
>elif schema_type in ['float', 'double']:
>  return (isinstance(datum, int) or isinstance(datum, long)
> -or isinstance(datum, float))
> +or isinstance(datum, float) or isinstance(datum, Decimal))
>elif schema_type == 'fixed':
>  return isinstance(datum, str) and len(datum) == expected_schema.size
>elif schema_type == 'enum':
> {code}





[jira] [Updated] (AVRO-1236) AvroMultipleOutputs fails to close successfuly

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1236:
--
Component/s: java

> AvroMultipleOutputs fails to close successfuly
> --
>
> Key: AVRO-1236
> URL: https://issues.apache.org/jira/browse/AVRO-1236
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.3
>Reporter: Victor Iacoban
>Priority: Major
>
> When I use AvroMultipleOutputs, my job fails with the exception below, but it 
> works fine if I replace AvroMultipleOutputs with MultipleOutputs:
> 2013-01-29 14:29:04,771 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>  No lease on 
> /tmp/avros/_temporary/_attempt_201301290714_0012_m_00_0/part-m-0 File 
> is not open for writing. Holder DFSClient_NONMAPREDUCE_-853305103_1 does not 
> have any open files.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2316)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2299)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2095)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:416)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1160)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>   at $Proxy10.addBlock(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:616)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>   at $Proxy10.addBlock(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:290)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1150)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1003)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)





[jira] [Updated] (AVRO-1479) JavaScript encoder

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1479:
--
Component/s: javascript

> JavaScript encoder
> --
>
> Key: AVRO-1479
> URL: https://issues.apache.org/jira/browse/AVRO-1479
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: javascript
>Reporter: Sal Zsosn
>Priority: Trivial
>
> I've been working on an encoder for JavaScript.
> Unlike the other JavaScript implementations I've seen out there, this one 
> supports all Avro types and works not only in node.js but also in the 
> browser. The Java implementation is also able to parse the resulting file 
> formats.
> I've included examples. It's only able to encode, not decode yet.
> Maybe that's something worth including in future releases.
> http://www.speedyshare.com/CAwTj/avro-encoder.7z





[jira] [Updated] (AVRO-2084) API changes review for Avro

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2084:
--
Component/s: java

> API changes review for Avro
> ---
>
> Key: AVRO-2084
> URL: https://issues.apache.org/jira/browse/AVRO-2084
> Project: Apache Avro
>  Issue Type: Test
>  Components: java
>Reporter: Andrey Ponomarenko
>Priority: Major
> Attachments: Avro-Report-1.png, Avro-Report-2.png
>
>
> The review of API changes for the Avro library since 1.0.0 version: 
> https://abi-laboratory.pro/java/tracker/timeline/avro/
> The report is updated three times a week. Hope it will be helpful for users 
> and maintainers of the library.
> The report is generated by https://github.com/lvc/japi-tracker
> Thank you.
> !Avro-Report-1.png|API changes review!
> !Avro-Report-2.png|API symbols timeline!





[jira] [Updated] (AVRO-1547) AvroApp Schema Tool

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1547:
--
Component/s: java

> AvroApp Schema Tool
> ---
>
> Key: AVRO-1547
> URL: https://issues.apache.org/jira/browse/AVRO-1547
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Minor
> Fix For: 1.9.0
>
>
> Over in Gora, I have been thinking for a while that the process of writing 
> JSON data beans is rather time consuming when beans are LARGE.
> I have wanted to open this ticket for a while and am only now getting around 
> to it. I propose the following:
> A simple HTML webpage that defines a form of sorts. The form will enable 
> users to create JSON schemas by entering Object values based on the current 
> Avro specification document, i.e. it will be restrictive in scope.
> On top of this, I propose to use simple jQuery to send a request to the 
> JSONBlob API [0], obtain a JSON representation of the data, and then 
> pretty-print this information to a file within the browser. Users can then 
> save this file locally and do with it what they wish.
> I think that this page can easily be hosted alongside the current static Avro 
> website and that there is no need to write a web application for this yet.
> I'll try to work on it sooner rather than later, as this would also lower the 
> barrier for users of Gora (as I am sure it would for users of other 
> technologies that require defining Objects via JSON schemas).
> I've not assigned this against any component as there is none which I feel 
> appropriate.





[jira] [Updated] (AVRO-1899) PascalCase for property names generated by avrogen for C#

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1899:
--
Component/s: csharp

> PascalCase for property names generated by avrogen for C#
> -
>
> Key: AVRO-1899
> URL: https://issues.apache.org/jira/browse/AVRO-1899
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: csharp
>Reporter: Xtra Coder
>Priority: Major
>
> Currently (code in branch 1.8) avrogen generates properties in C# data 
> classes with exactly the names defined in the schema, so a field named 
> 'favorite_color' results in code like the following:
> public string favorite_color {
> get { return this._favorite_color; }
> set { this._favorite_color = value; }
> }
> In general property names should use PascalCasing (see: 
> https://msdn.microsoft.com/en-us/library/ms229043.aspx) and correctly 
> generated code would look like
> public string FavoriteColor {
> get { return this._favorite_color; }
> set { this._favorite_color = value; }
> }
> Potential change is rather minor:
> .\avro\lang\csharp\src\apache\main\CodeGen\CodeGen.cs : 581
> change
> var mangledName = CodeGenUtil.Instance.Mangle(field.Name);
> to
> var mangledName = CodeGenUtil.Instance.Mangle(AsPropName(field.Name));
> where the AsPropName function might look like the following
> public string AsPropName(string name) {
> return Regex.Replace(name, @"^\S|_\S", match => match.Value.Replace("_","").ToUpper());
> }
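
To make the effect of the regular expression concrete, here is the same conversion ported to Java purely for illustration; the actual change would be made in the C# code generator. `PropNameSketch` and `asPropName` are hypothetical names. The pattern `^\S|_\S` matches either the first character of the name or an underscore followed by a non-whitespace character; each match is replaced by itself with the underscore removed and the remainder upper-cased.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical Java port of the AsPropName idea; not part of avrogen.
class PropNameSketch {

    // First character of the name, or an underscore plus the character after it.
    private static final Pattern BOUNDARY = Pattern.compile("^\\S|_\\S");

    static String asPropName(String name) {
        Matcher m = BOUNDARY.matcher(name);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Drop the underscore (if any) and upper-case what is left of the match.
            String upper = m.group().replace("_", "").toUpperCase(Locale.ROOT);
            m.appendReplacement(out, Matcher.quoteReplacement(upper));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(asPropName("favorite_color"));
    }
}
```

Note that only the property name changes in the proposal; the backing field in the examples above keeps its `_favorite_color` spelling.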





[jira] [Updated] (AVRO-1938) Python (2) support for generating canonical forms of schema

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1938:
--
Component/s: python

> Python (2) support for generating canonical forms of schema
> ---
>
> Key: AVRO-1938
> URL: https://issues.apache.org/jira/browse/AVRO-1938
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: python
>Reporter: Erik Forsberg
>Priority: Major
>
> The python implementation(s) lack support for generating canonical forms of 
> schema.





[jira] [Updated] (AVRO-1452) Problem when using AvroMultipleOutputs with multiple schemas

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1452:
--
Component/s: java

> Problem when using AvroMultipleOutputs with multiple schemas
> 
>
> Key: AVRO-1452
> URL: https://issues.apache.org/jira/browse/AVRO-1452
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.6
> Environment: Any Platform
>Reporter: Vladislav Spivak
>Priority: Major
>  Labels: easyfix
>
> When using multiple named outputs with different Key/Value Schemas, the last 
> provided schema overrides any previous schema definitions after the first 
> write attempt. This happens because of the following code at 
> AvroMultipleOutputs.java:509
> /*begin*/
> Job job = new Job(context.getConfiguration());
>...
> setSchema(job, keySchema, valSchema);
> taskContext = createTaskAttemptContext(
>   job.getConfiguration(), context.getTaskAttemptID());
> /*end*/
> Every time this code runs, the configuration instance passed to 
> createTaskAttemptContext is the same one, because the Job constructor creates 
> a new configuration copy only if the argument is not an instance of JobConf. 
> As a result, the "avro.schema.output.XXX" properties are overwritten each 
> time a new TaskAttemptContext is initialised, and a single Configuration 
> instance is mistakenly shared by all TaskAttemptContexts.
> Proposed fix:
> a) use "Job getInstance(Configuration conf)" or
> b) call "new Job(new Configuration(context.getConfiguration()))"
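
The aliasing problem described above does not depend on Hadoop, so it can be shown with the JDK alone. In this sketch a `java.util.Properties` object stands in for Hadoop's `Configuration`; that substitution, the class name `SharedConfigSketch`, both method names, and the key `sketch.output.schema` are all assumptions made for illustration and are not taken from AvroMultipleOutputs.

```java
import java.util.Properties;

// Stand-in demonstration only: Properties plays the role of Hadoop's Configuration.
class SharedConfigSketch {

    // Mimics the reported behaviour: every named output writes into the SAME object.
    static Properties contextSharing(Properties base, String schema) {
        base.setProperty("sketch.output.schema", schema);
        return base;
    }

    // Mimics the proposed fix: copy the settings first, then set the per-output schema.
    static Properties contextCopying(Properties base, String schema) {
        Properties copy = new Properties();
        copy.putAll(base);
        copy.setProperty("sketch.output.schema", schema);
        return copy;
    }

    public static void main(String[] args) {
        Properties shared = new Properties();
        Properties first = contextSharing(shared, "schemaA");
        contextSharing(shared, "schemaB");
        // The first context now sees the schema set for the second output.
        System.out.println(first.getProperty("sketch.output.schema"));
    }
}
```

With the copying variant each context keeps the schema it was created with, which is the behaviour both proposed fixes aim for.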





[jira] [Updated] (AVRO-1818) Avoid buffer copy in DeflateCodec.compress and decompress

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1818:
--
Component/s: java

> Avoid buffer copy in DeflateCodec.compress and decompress
> -
>
> Key: AVRO-1818
> URL: https://issues.apache.org/jira/browse/AVRO-1818
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Rohini Palaniswamy
>Assignee: Nandor Kollar
>Priority: Major
>
> One of our jobs reading avro hit OOM due to the buffer copy in compress and 
> decompress methods which is very inefficient. 
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/file/DeflateCodec.java#L71-L86
> {code}
> java.lang.OutOfMemoryError: Java heap space
>   at java.util.Arrays.copyOf(Arrays.java:3236)
>   at 
> java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
>   at org.apache.avro.file.DeflateCodec.decompress(DeflateCodec.java:84)
> {code}
> I would suggest using a class that extends ByteArrayOutputStream like 
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java#L51-L53
> and do
> ByteBuffer result = ByteBuffer.wrap(buf.getData(), 0, buf.getLength());
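
A minimal sketch of that suggestion using only the JDK: a subclass that exposes the protected `buf` and `count` fields of `ByteArrayOutputStream`, so the result can be wrapped rather than copied by `toByteArray()`. The class name `ExposedByteArrayOutputStream` and the helper `asByteBuffer` are hypothetical; `getData()` and `getLength()` follow the naming of the Hadoop class linked above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical class, not part of Avro: gives access to the internal array
// so callers can avoid the copy made by toByteArray().
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {

    ExposedByteArrayOutputStream(int initialSize) {
        super(initialSize);
    }

    // The backing array; only the first getLength() bytes are valid.
    byte[] getData() {
        return buf;
    }

    int getLength() {
        return count;
    }

    // Zero-copy view over the bytes written so far.
    ByteBuffer asByteBuffer() {
        return ByteBuffer.wrap(buf, 0, count);
    }

    public static void main(String[] args) {
        ExposedByteArrayOutputStream out = new ExposedByteArrayOutputStream(64);
        byte[] payload = "avro".getBytes(StandardCharsets.UTF_8);
        out.write(payload, 0, payload.length);
        System.out.println(out.asByteBuffer().remaining());
    }
}
```

The trade-off is that the returned buffer aliases the stream's storage: it is only safe to use while the stream is not written to or reset, since a later write may overwrite or reallocate the array.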





[jira] [Updated] (AVRO-1559) Drop support for Ruby 1.8

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1559:
--
Component/s: ruby

> Drop support for Ruby 1.8
> -
>
> Key: AVRO-1559
> URL: https://issues.apache.org/jira/browse/AVRO-1559
> Project: Apache Avro
>  Issue Type: Wish
>  Components: ruby
>Affects Versions: 1.7.7
>Reporter: Willem van Bergen
>Assignee: Willem van Bergen
>Priority: Major
> Fix For: 1.9.0
>
> Attachments: AVRO-1559.patch
>
>
> - Ruby 1.8 is EOL, and even security issues aren't addressed anymore. 
> - It is also getting hard to set up Ruby 1.8 to run the tests (e.g. on a 
> recent OSX, it won't compile without manual fiddling).
> - Handling character encodings in Ruby 1.9 is very different than Ruby 1.8. 
> Supporting both at the same time adds a lot of overhead.





[jira] [Updated] (AVRO-1742) Avro C# DataFileWriter Flush() does not flush the buffer to disk

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1742:
--
Component/s: csharp

> Avro C# DataFileWriter Flush() does not flush the buffer to disk
> 
>
> Key: AVRO-1742
> URL: https://issues.apache.org/jira/browse/AVRO-1742
> Project: Apache Avro
>  Issue Type: Bug
>  Components: csharp
>Reporter: Mika Ristimaki
>Priority: Minor
>
> In C# DataFileWriter.Flush() is implemented as 
> {code}
> public void Flush()
> {
> EnsureHeader();
> Sync();
> }
> {code}
> Is this required by the Avro spec, or is it a bug? Should calling 
> DataFileWriter.Flush() just start a new sync block without flushing the file 
> to disk?
> In Java the implementation is
> {code}
>  @Override
>   public void flush() throws IOException {
> sync();
> vout.flush();
>   }
> {code}
> where vout is a BinaryEncoder. So I think the correct implementation in C# is
> {code}
> public void Flush()
> {
> EnsureHeader();
> Sync();
> _encoder.Flush();
> }
> {code}
> If someone can confirm my suspicion I'll try to contribute a fix in the near 
> future.
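
The difference between finishing a block and actually flushing can be illustrated with plain JDK streams. In this sketch a `BufferedOutputStream` stands in for the encoder and a `ByteArrayOutputStream` for the file; both substitutions, and the names `FlushSketch` and `bytesReachingSink`, are assumptions for illustration rather than Avro code.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in demonstration only: shows that written data can sit in a buffer
// until flush() is called on the buffering layer.
class FlushSketch {

    // Returns how many bytes reached the sink after a small write.
    static int bytesReachingSink(boolean flush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // plays the file
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 1024); // plays the encoder
        try {
            buffered.write(new byte[] {1, 2, 3, 4}); // plays a completed block
            if (flush) {
                buffered.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println(bytesReachingSink(false));
        System.out.println(bytesReachingSink(true));
    }
}
```

Without the flush the four bytes stay in the 1024-byte buffer, which is the situation a `Flush()` that only calls `Sync()` would leave the caller in.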





[jira] [Updated] (AVRO-1526) Add support for writing/reading json records using python3.

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1526:
--
Component/s: python

> Add support for writing/reading json records using python3.
> ---
>
> Key: AVRO-1526
> URL: https://issues.apache.org/jira/browse/AVRO-1526
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: python
>Reporter: Robert Chu
>Assignee: Christophe Taton
>Priority: Major
>
> Currently the avro python3 package only supports reading/writing binary data.





[jira] [Updated] (AVRO-1916) Building python version uses wrong version avro-tools.

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1916:
--
Component/s: python

> Building python version uses wrong version avro-tools.
> --
>
> Key: AVRO-1916
> URL: https://issues.apache.org/jira/browse/AVRO-1916
> Project: Apache Avro
>  Issue Type: Bug
>  Components: python
>Reporter: Niels Basjes
>Priority: Major
>
> During {{./build.sh test}} I see this during the build of {{lang/py}}
> {code}
> [ivy:retrieve]found org.apache.avro#avro-tools;1.9.0-SNAPSHOT in 
> apache-snapshots
> [ivy:retrieve] downloading 
> https://repository.apache.org/content/groups/snapshots/org/apache/avro/avro-tools/1.9.0-SNAPSHOT/avro-tools-1.9.0-20160122.173016-35.jar
>  ...
> {code}
> So apparently the py build phase uses an external version of avro-tools.
> What if I just updated avro-tools? Then it is quite possible the test will 
> pass while in reality it should have failed.
> I suspect the fix can be as simple as doing a {{mvn install}} on the java 
> avro-tools before building/testing the rest of the languages.





[jira] [Updated] (AVRO-1527) support bzip2 in python avro tool

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1527:
--
Component/s: python

> support bzip2 in python avro tool
> -
>
> Key: AVRO-1527
> URL: https://issues.apache.org/jira/browse/AVRO-1527
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: python
>Affects Versions: 1.7.6
>Reporter: Eustache
>Priority: Minor
>  Labels: avro, bzip2, python
> Fix For: 1.7.9
>
> Attachments: AVRO-1527.diff
>
>
> The python tool to decode avro files is missing bzip2 support.
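A minimal sketch of what a bzip2 codec entry could look like, using only Python's standard `bz2` module. The `CODECS` table and the `encode_block`/`decode_block` helpers are hypothetical names chosen for illustration; they are not the avro tool's actual API, and the attached AVRO-1527.diff may take a different approach.

```python
import bz2

# Hypothetical codec table: codec name -> (compress, decompress).
# "null" passes bytes through unchanged; "bzip2" uses the stdlib bz2 module.
CODECS = {
    "null": (lambda data: data, lambda data: data),
    "bzip2": (bz2.compress, bz2.decompress),
}


def encode_block(codec: str, data: bytes) -> bytes:
    """Compress one block of bytes with the named codec."""
    try:
        compress, _ = CODECS[codec]
    except KeyError:
        raise ValueError("unsupported codec: %s" % codec)
    return compress(data)


def decode_block(codec: str, data: bytes) -> bytes:
    """Decompress one block of bytes with the named codec."""
    try:
        _, decompress = CODECS[codec]
    except KeyError:
        raise ValueError("unsupported codec: %s" % codec)
    return decompress(data)
```

A reader would look up the codec name stored in the file metadata and call `decode_block` on each block; an unknown name fails loudly instead of returning garbage.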





[jira] [Updated] (AVRO-1478) protobuf namespaces causing problem for avro c++ reader

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1478:
--
Component/s: c++

> protobuf namespaces causing problem for avro c++ reader
> ---
>
> Key: AVRO-1478
> URL: https://issues.apache.org/jira/browse/AVRO-1478
> Project: Apache Avro
>  Issue Type: Bug
>  Components: c++
>Reporter: George Baxter
>Priority: Major
>
> When we use the ProtobufData functionality to generate Avro output, we run into 
> a complication when consuming this output with the C++ Avro reader. 
> It does not seem to like the '$' of a nesting outer class that is inherent 
> to protocol buffers in Java.
> Exception opening file for read:Invalid namespace: 
> com.xxx.base.message.MessageProtos$
> in 
> avro::DataFileReader* file_reader;
> file_reader = new 
> avro::DataFileReader(file_name.c_str());]





[jira] [Updated] (AVRO-1208) Improve Trevni's performance on row-oriented data access

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1208:
--
Component/s: java

> Improve Trevni's performance on row-oriented data access
> 
>
> Key: AVRO-1208
> URL: https://issues.apache.org/jira/browse/AVRO-1208
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.7.3
>Reporter: Yin Huai
>Assignee: Yin Huai
>Priority: Major
> Attachments: AVRO-1208.1.patch, AVRO-1208.2.patch
>
>
> Trevni uses a 64KB internal buffer to store values of a column. When 
> accessing a column, it reads 64KB of data (ignoring compression and 
> checksums) from the storage layer. However, when the table is accessed in 
> a row-oriented fashion (an entire row needs to be handed over to the upper 
> layer), in the worst case (a full table scan where the values of the table are all 
> the same size), every 64KB data read can cause a seek.
> This jira is used to discuss if we should consider the data access pattern 
> mentioned above and if so, how to improve the performance of Trevni. 
> Row-oriented data processing engines, e.g. Hive, can benefit from this work.





[jira] [Updated] (AVRO-1192) trevni should support RLE encoding based on selectivity

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1192:
--
Component/s: java

> trevni should support RLE encoding based on selectivity
> ---
>
> Key: AVRO-1192
> URL: https://issues.apache.org/jira/browse/AVRO-1192
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: alex gemini
>Priority: Minor
>  Labels: compression, performance
>
> It would be nice if Trevni supported run-length encoding. A columnar format 
> should first order the columns based on selectivity. For higher-selectivity 
> columns, Trevni should support run-length encoding. More 
> information can be found in the paper "C-Store: A Column-oriented DBMS", section 
> 3.1: Encoding Schemes.
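To make the request concrete, here is a small sketch of run-length encoding over a column of values. It illustrates the general technique only; the function names are made up for this example and nothing here reflects how Trevni stores columns.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out
```

The encoding pays off when a column contains long runs of identical values, which is why sorting the data first matters.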





[jira] [Updated] (AVRO-1801) Generated code results in java.lang.ClassCastException

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1801:
--
Component/s: java

> Generated code results in java.lang.ClassCastException
> --
>
> Key: AVRO-1801
> URL: https://issues.apache.org/jira/browse/AVRO-1801
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Alex Baumgarten
>Priority: Major
>
> Create and compile avro schema:
> {
> "namespace": "com.abc.def.ghi.schema",
> "type": "record",
> "name": "MyDataRecord",
> "fields": [
> {"name": "Heading", "type": ["null", {"type": "fixed", "name": 
> "short", "size": 2}]}
> ]
> }
> which leads to compiled code:
> public void put(int field$, java.lang.Object value$) {
>   switch (field$) {
>   case 0: Heading = (com.abc.def.ghi.schema.short$)value$; break;
>   default: throw new org.apache.avro.AvroRuntimeException("Bad index");
>   }
> }
> When this function is called the type of value is 
> org.apache.avro.generic.GenericData$Fixed and when it tries to cast to the 
> short$ type it throws a java.lang.ClassCastException.
> This occurs when running the following code:
> SpecificDatumReader datumReader = new 
> SpecificDatumReader<>(MyDataRecord.class);
> DataFileReader dataFileReader = new DataFileReader<>(new 
> FsInput(inputAvroPath, configuration), datumReader);
> for (MyDataRecord record : dataFileReader) {
> // Do something with record
> }
> If I manually modify the generated code to extract the bytes from value$ and 
> call the constructor of short$ it works as expected. But this is not what is 
> generated.





[jira] [Updated] (AVRO-2029) Specific Data generated class missing Decimal Conversion

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2029:
--
Component/s: java

> Specific Data generated class missing Decimal Conversion
> 
>
> Key: AVRO-2029
> URL: https://issues.apache.org/jira/browse/AVRO-2029
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.2
>Reporter: Adrian McCague
>Priority: Major
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> Using 1.8.2-rc3
> Given a class generated with {{-bigDecimal}}, the generated class defines the 
> DECIMAL_CONVERSION but does not set it to a {{BigDecimal}} field index.
> Fields for illustration:
> {code}
> @Deprecated public java.lang.String id;
> @Deprecated public org.joda.time.DateTime timestamp;
> @Deprecated public java.lang.String applicationId;
> @Deprecated public java.math.BigDecimal amount;;
> ...
> protected static final 
> org.apache.avro.data.TimeConversions.TimestampConversion TIMESTAMP_CONVERSION 
> = new org.apache.avro.data.TimeConversions.TimestampConversion();
> protected static final org.apache.avro.Conversions.DecimalConversion 
> DECIMAL_CONVERSION = new org.apache.avro.Conversions.DecimalConversion();
> private static final org.apache.avro.Conversion[] conversions =
>   new org.apache.avro.Conversion[] {
>   null,
>   TIMESTAMP_CONVERSION,
>   null,
>   null, // Should be DECIMAL_CONVERSION
>   null,
>   null,
>   null,
>   null,
>   null
>   };
> {code}
> I am currently unsure of the impact of this.





[jira] [Updated] (AVRO-1196) trevni should add max-min value on file header

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1196:
--
Component/s: java

> trevni should add max-min value on file header
> --
>
> Key: AVRO-1196
> URL: https://issues.apache.org/jira/browse/AVRO-1196
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: alex gemini
>Priority: Minor
>  Labels: performance
>
> Trevni's file header should contain the max and min values for the current block. This would 
> further support predicate pushdown in query engines.
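The idea can be sketched as follows: if each block records its minimum and maximum value, a reader can skip blocks that cannot match a range predicate. `BlockStats`, `block_stats`, and `blocks_to_read` are hypothetical names for this illustration and do not correspond to anything in the Trevni file format.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Hypothetical per-block statistics kept alongside the data."""
    min_value: int
    max_value: int


def block_stats(values):
    """Compute the statistics for one non-empty block of values."""
    return BlockStats(min(values), max(values))


def may_contain(stats, low, high):
    """True if the block could hold a value in the closed range [low, high]."""
    return stats.max_value >= low and stats.min_value <= high


def blocks_to_read(all_stats, low, high):
    """Indexes of the blocks that cannot be ruled out by their statistics."""
    return [i for i, s in enumerate(all_stats) if may_contain(s, low, high)]
```

Blocks whose range does not overlap the predicate are never read, so the saving grows with how well the data is clustered.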





[jira] [Updated] (AVRO-1666) avro.ipc.Responder logger is too noisy and have system/user error as WARN

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1666:
--
Component/s: java

> avro.ipc.Responder logger is too noisy and have system/user error as WARN
> -
>
> Key: AVRO-1666
> URL: https://issues.apache.org/jira/browse/AVRO-1666
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Jacek Migdal
>Priority: Major
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I use avro-ipc a lot and enjoy it. Great work! I would love to contribute back.
> We sometimes use avro-ipc exceptions to signal rare but correct situations 
> (e.g. a user session has ended). Because of our scale, this causes tons of WARN 
> logs with stack traces from avro.ipc.Responder:
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.avro/avro-ipc/1.7.5/org/apache/avro/ipc/Responder.java#156
> Though I would like to exclude them in log4j, I can't, because I'm interested in 
> "system error" messages, which signal real bugs and are also logged at WARN level.
> Potential solutions that would make me happy:
> 1. Move "user error" to INFO level.
> 2. Move "system error" to ERROR level.
> 3. Have some option/flag to switch off INFO level.
>  
> Happy to write a patch for that once I get the blessing of a core developer.





[jira] [Updated] (AVRO-1707) Java serialization readers/writers in generated Java classes

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1707:
--
Component/s: java

> Java serialization readers/writers in generated Java classes
> 
>
> Key: AVRO-1707
> URL: https://issues.apache.org/jira/browse/AVRO-1707
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: Zoltan Farkas
>Priority: Major
>
> The following static instances are declared in the generated classes:
>   private static final org.apache.avro.io.DatumWriter
> WRITER$ = new org.apache.avro.specific.SpecificDatumWriter(SCHEMA$);  
>   private static final org.apache.avro.io.DatumReader
> READER$ = new org.apache.avro.specific.SpecificDatumReader(SCHEMA$);  
> The reader/writer hold on to a reference to the "creator thread":
> "private final Thread creator;"
> which inhibits GC of the thread locals for this thread.





[jira] [Updated] (AVRO-1466) Avro Tools fromjson (ie JsonDecoder) cannot parse "NaN" values created by tojson (ie JsonEncoder)

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1466:
--
Component/s: java

> Avro Tools fromjson (ie JsonDecoder) cannot parse "NaN" values created by 
> tojson (ie JsonEncoder)
> -
>
> Key: AVRO-1466
> URL: https://issues.apache.org/jira/browse/AVRO-1466
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.6
>Reporter: Jamie Olson
>Priority: Major
>
> Avro files containing NaN values are converted to JSON as a string "NaN" by 
> Avro Tools tojson command (ie JsonEncoder).  These values cannot be parsed by 
> the Avro Tools fromjson command (ie JsonDecoder.readDouble).
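The mismatch can be illustrated with a tolerant reader that accepts either a JSON number or the string "NaN" for a double field. This is a sketch of the general idea in Python, not the Java JsonDecoder; `read_double` is a hypothetical helper, and handling only the "NaN" string (not infinities) is a deliberate limit of this example because the report mentions only NaN.

```python
import json
import math


def read_double(token):
    """Turn an already-parsed JSON token into a float.

    Accepts JSON numbers and, as a special case, the string "NaN",
    which is how the report says NaN values are written out.
    """
    if isinstance(token, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise ValueError("expected a double, got a boolean")
    if isinstance(token, (int, float)):
        return float(token)
    if token == "NaN":
        return math.nan
    raise ValueError("expected a double, got %r" % (token,))
```

For example, `read_double(json.loads('"NaN"'))` yields a float NaN, while any other string is still rejected.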





[jira] [Updated] (AVRO-1321) Avro-ipc-tests in compile scope instead of test in Avro-mapred

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1321:
--
Component/s: java

> Avro-ipc-tests in compile scope instead of test in Avro-mapred
> --
>
> Key: AVRO-1321
> URL: https://issues.apache.org/jira/browse/AVRO-1321
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.3
>Reporter: Benyi Wang
>Priority: Trivial
>
> org.apache.avro:avro-ipc:1.7.3:tests is listed in "compile" scope instead of 
> "test" scope.





[jira] [Updated] (AVRO-1703) Specific record should not only be determined by presence of SCHEMA$ field

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1703:
--
Component/s: java

> Specific record should not only be determined by presence of SCHEMA$ field
> --
>
> Key: AVRO-1703
> URL: https://issues.apache.org/jira/browse/AVRO-1703
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Marius Soutier
>Priority: Major
>  Labels: starter
>
> I want to use Avro from Scala, i.e. generate case classes from an Avro 
> schema. So far this is working fine except for one thing - fields in Scala 
> classes are always private. This doesn't work with Avro SpecificRecords (at 
> least when inferring the schema from the class) and results in the following 
> exception:
> org.apache.avro.AvroRuntimeException: java.lang.IllegalAccessException: Class 
> org.apache.avro.specific.SpecificData can not access a member of class 
>  with modifiers "private"
> The exception is thrown from the following line in 
> org.apache.avro.specific.SpecificData:
> schema = (Schema)(c.getDeclaredField("SCHEMA$").get(null));
> My suggestion would be to additionally check for a method called `getSchema` 
> and read the schema from that method.





[jira] [Updated] (AVRO-2291) GenericData.Array not reusable after AVRO-2050

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-2291:
--
Component/s: java

> GenericData.Array not reusable after AVRO-2050
> --
>
> Key: AVRO-2291
> URL: https://issues.apache.org/jira/browse/AVRO-2291
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Reporter: Martin Jubelgas
>Priority: Minor
>
> The fix for AVRO-2050 left the reuse functionality of GenericData.Array 
> broken. Because GenericData.Array.clear() now nulls all entries of the 
> underlying array, there is no data left to be reused in 
> GenericDatumReader.readArray().
> I've already posted a pull request that addresses this issue in a backward 
> compatible manner as a possible solution. Comments welcome.





[jira] [Updated] (AVRO-1648) @Union annotation cannot handle the class on which its used

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1648:
--
Component/s: java

> @Union annotation cannot handle the class on which its used
> ---
>
> Key: AVRO-1648
> URL: https://issues.apache.org/jira/browse/AVRO-1648
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7
>Reporter: Sachin Goyal
>Priority: Major
>
> The bug is as shown in the following code:
> {code}
> // Having Base.class in the union results in infinite recursion
> @Union ({Base.class, Derived.class})
> // Having no Base.class in the union fails PolymorphicDO.obj2
> @Union ({Derived.class})
> private static class Base 
> {
>   Integer a = 5;
> }
> private static class Derived extends Base
> {
>   String b = "Foo";
> }
> private static class PolymorphicDO
> {
>   Base obj = new Derived();
>   Base obj2 = new Base();
> }
> {code}





[jira] [Updated] (AVRO-1194) trevni should support delta encoding based on selectivity and data type

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1194:
--
Component/s: java

> trevni should support delta encoding based on selectivity and data type
> ---
>
> Key: AVRO-1194
> URL: https://issues.apache.org/jira/browse/AVRO-1194
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: alex gemini
>Priority: Minor
>
> It would be nice if Trevni supported delta encoding. A columnar format should 
> first order the columns based on selectivity. For medium-selectivity 
> columns such as timestamps, Trevni should support delta encoding. More 
> information can be found in the paper "C-Store: A Column-oriented DBMS", section 
> 3.1: Encoding Schemes, type 3.
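As a rough illustration of delta encoding for a timestamp-like column, the sketch below stores the first value followed by the differences between neighbours. The function names are invented for this example and the layout is an assumption, not the scheme from the paper or anything Trevni implements.

```python
def delta_encode(values):
    """Return [first value, then each value minus its predecessor]."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by taking a running sum of the deltas."""
    out = []
    total = 0
    for delta in deltas:
        total += delta
        out.append(total)
    return out
```

When consecutive values are close together, the deltas are small numbers that a variable-length integer encoding can store in fewer bytes than the raw values.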



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AVRO-1468) implement interface-based code-generation

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1468:
--
Component/s: java

> implement interface-based code-generation
> -
>
> Key: AVRO-1468
> URL: https://issues.apache.org/jira/browse/AVRO-1468
> Project: Apache Avro
>  Issue Type: New Feature
>  Components: java
>Reporter: Doug Cutting
>Priority: Major
>
> The current specific compiler generates a concrete class per record.  
> Instead, we might generate an interface per record that might be implemented 
> in different ways.  Implementations might include:
>  - A wrapper for a generic record.  This would permit the schema that is 
> compiled against to differ from that of the runtime instance.  A field that 
> was added since code-generation could be retained as records are filtered or 
> sorted and re-written.
>  - A concrete record.  This would be similar to the existing specific.
>  - A wrapped POJO.  The generated class could wrap a POJO using reflection.  
> Aliases could map between the schema used at compilation and that of the 
> POJO, so field and class names need not match exactly.  This would permit one 
> to evolve from a POJO-based Avro application to using generated code without 
> breaking existing code.
> This approach was first described in http://s.apache.org/AvroFlex





[jira] [Updated] (AVRO-1641) parser.java stack should expand quickly up to some threshold rather than start at the threshold

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1641:
--
Component/s: java

> parser.java stack should expand quickly up to some threshold rather than 
> start at the threshold
> ---
>
> Key: AVRO-1641
> URL: https://issues.apache.org/jira/browse/AVRO-1641
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.7.7, 1.8.0
>Reporter: Zoltan Farkas
>Assignee: Zoltan Farkas
>Priority: Minor
> Attachments: AVRO-1641.patch
>
>
> At Parser.java line 65 
> (https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/Parser.java#L65):
>  
> {noformat}
> private void expandStack() {
>   stack = Arrays.copyOf(stack, stack.length+Math.max(stack.length,1024));
> }
> {noformat}
> should probably be:
> {noformat}
> private void expandStack() {
>   stack = Arrays.copyOf(stack, stack.length+Math.min(stack.length,1024));
> }
> {noformat}
> This expansion is probably intended to grow the stack exponentially (doubling) up to 1024 
> and then by 1024 at a time, rather than by 1024 at first and exponentially after 1024.
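The difference between the two formulas is easy to see by simulating the sizes they produce. The sketch below redoes the arithmetic in Python; the starting size of 16 is an assumption picked for illustration and is not the initial stack size used by Parser.java.

```python
def grow(length, pick):
    """One expansion step: length + pick(length, 1024), as in expandStack()."""
    return length + pick(length, 1024)


def sizes(start, steps, pick):
    """Stack sizes after each of `steps` expansions, starting from `start`."""
    out = [start]
    for _ in range(steps):
        out.append(grow(out[-1], pick))
    return out


# Current code (Math.max): jumps by 1024 at first, then doubles without bound.
with_max = sizes(16, 4, max)   # [16, 1040, 2080, 4160, 8320]

# Proposed code (Math.min): doubles up to 1024, then adds 1024 per step.
with_min = sizes(16, 8, min)   # [16, 32, 64, 128, 256, 512, 1024, 2048, 3072]
```

With `min`, small stacks stay small and large stacks grow by a fixed 1024 entries per expansion, which matches the behaviour the issue title asks for.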





[jira] [Updated] (AVRO-1195) trevni should support dictionary encoding based on selectivity and data type

2018-12-29 Thread Thiruvalluvan M. G. (JIRA)


 [ 
https://issues.apache.org/jira/browse/AVRO-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvalluvan M. G. updated AVRO-1195:
--
Component/s: java

> trevni should support dictionary encoding based on selectivity and data type
> 
>
> Key: AVRO-1195
> URL: https://issues.apache.org/jira/browse/AVRO-1195
> Project: Apache Avro
>  Issue Type: Improvement
>  Components: java
>Affects Versions: 1.8.0
>Reporter: alex gemini
>Priority: Minor
>  Labels: compression, performance
>
> It would be nice if Trevni supported dictionary encoding. A columnar format 
> should first order the columns based on selectivity. For lower-selectivity 
> columns such as email or address, Trevni should support dictionary 
> encoding. More information can be found in the paper "C-Store: A Column-oriented 
> DBMS", section 3.1: Encoding Schemes, type 4.
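For readers unfamiliar with the technique, dictionary encoding replaces each distinct value with a small integer code and stores the distinct values once. The following sketch shows the idea; the function names and the (dictionary, codes) layout are assumptions for this example, not Trevni's format.

```python
def dict_encode(values):
    """Return (dictionary, codes): distinct values in first-seen order,
    plus one integer code per input value pointing into the dictionary."""
    dictionary = []
    index = {}
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dict_decode(dictionary, codes):
    """Look each code up in the dictionary to rebuild the original values."""
    return [dictionary[code] for code in codes]
```

The saving comes from storing long strings such as addresses once and repeating only the short integer codes.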




