[jira] [Commented] (AVRO-987) Make Avro OSGi ready
[ https://issues.apache.org/jira/browse/AVRO-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867722#comment-13867722 ] Ioannis Canellos commented on AVRO-987: --- Since avro-tools is just a set of utilities and commands that is NOT meant to be shared across bundles, even keeping it as a plain jar (not an OSGi bundle) could make sense. In fact, even in normal OSGi bundles people often prefer to keep their utilities in private packages. Personally, I don't have a strong preference on this, so I'd say that if renaming is not an option for a 1.7.x release, let's keep avro-tools a plain jar and make it an OSGi bundle in 1.8.x. Make Avro OSGi ready Key: AVRO-987 URL: https://issues.apache.org/jira/browse/AVRO-987 Project: Avro Issue Type: New Feature Components: java Reporter: Ioannis Canellos Assignee: Ioannis Canellos Fix For: 1.7.6 Attachments: AVRO-987-1_6_3-patch.txt, AVRO-987-exe.patch, AVRO-987-exe.patch, AVRO-987-patch-updated.txt, AVRO-987-patch.txt, AVRO-987.patch It would be really nice to be able to use Avro inside OSGi. To achieve this two things are required: i) Provide proper MANIFEST.MF. ii) Deal with potential class loading issues. Avro uses Class.forName a lot and that is not very OSGi friendly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (AVRO-1434) ObjectCreator.cs is not thread safe
David Taylor created AVRO-1434: -- Summary: ObjectCreator.cs is not thread safe Key: AVRO-1434 URL: https://issues.apache.org/jira/browse/AVRO-1434 Project: Avro Issue Type: Bug Components: csharp Affects Versions: 1.7.5 Environment: Windows Reporter: David Taylor Public methods ObjectCreator.GetType() assign to shared fields without locks. This causes unpredictable behaviour in a multi-threaded application. The easiest fix is to simply remove the shared variables as they appear to exist as a potential performance improvement but constructing local variables seems to add no significant overhead. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
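To illustrate the kind of fix proposed in this report, here is a minimal Python sketch. It is not the C# code from ObjectCreator.cs, and every class and function name in it is hypothetical. It contrasts a resolver that stashes intermediate values in shared fields with one that keeps them in local variables, so concurrent callers cannot overwrite each other's state.

```python
import threading


class UnsafeTypeResolver:
    """Keeps scratch state in shared fields; concurrent callers can clobber it."""

    def __init__(self):
        self._name = None       # shared scratch field (the hazard)
        self._namespace = None  # shared scratch field (the hazard)

    def get_type(self, name, namespace):
        self._name = name
        self._namespace = namespace
        # Another thread may overwrite the fields between the writes above
        # and the read below, producing a mixed-up result.
        return f"{self._namespace}.{self._name}"


class SafeTypeResolver:
    """Uses only local variables, so no locking is needed."""

    def get_type(self, name, namespace):
        full_name = f"{namespace}.{name}"  # local to this call
        return full_name


def resolve_concurrently(resolver, jobs):
    """Run resolver.get_type for each (name, namespace) job on its own thread."""
    results = [None] * len(jobs)

    def work(i, name, namespace):
        results[i] = resolver.get_type(name, namespace)

    threads = [threading.Thread(target=work, args=(i, n, ns))
               for i, (n, ns) in enumerate(jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The design choice mirrors the report: dropping the shared fields removes the need for locks altogether, at the cost of rebuilding a small local value on each call.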
[jira] [Updated] (AVRO-1434) ObjectCreator.cs is not thread safe
[ https://issues.apache.org/jira/browse/AVRO-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Taylor updated AVRO-1434: --- Attachment: ObjectCreator.cs.diff Suggested patch attached. ObjectCreator.cs is not thread safe --- Key: AVRO-1434 URL: https://issues.apache.org/jira/browse/AVRO-1434 Project: Avro Issue Type: Bug Components: csharp Affects Versions: 1.7.5 Environment: Windows Reporter: David Taylor Attachments: ObjectCreator.cs.diff Public methods ObjectCreator.GetType() assign to shared fields without locks. This causes unpredictable behaviour in a multi-threaded application. The easiest fix is to simply remove the shared variables as they appear to exist as a potential performance improvement but constructing local variables seems to add no significant overhead. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (AVRO-625) RPC: permit out-of-order responses
[ https://issues.apache.org/jira/browse/AVRO-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868074#comment-13868074 ] Doug Cutting commented on AVRO-625: --- Perhaps we should use SPDY for Avro? SPDY provides secure sessions that multiplex many streams over a single connection. A handshake need only be performed once per session. Each request can use a new stream, so responses can arrive out-of-order. Both synchronous and asynchronous APIs could easily be supported. For Java, Jetty provides a client and server implementation. There are SPDY libraries for C, Ruby, and Python. For C# the best I can find is the one referenced from: http://mail-archives.apache.org/mod_mbox/tomcat-dev/201205.mbox/%3C002601cd3a90$af3e74e0$0dbb5ea0$@preis...@t-online.de%3E RPC: permit out-of-order responses -- Key: AVRO-625 URL: https://issues.apache.org/jira/browse/AVRO-625 Project: Avro Issue Type: New Feature Components: java, spec Reporter: Doug Cutting Assignee: Doug Cutting It should be possible, when using a stateful, connection-based transport, for a client to complete a second request over a connection before the first request has returned. In other words, responses should be permitted to arrive out-of-order. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
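As a rough sketch of what "one stream per request" buys, the Python below shows how a client could match responses to requests by stream id, so that a second request can complete before the first. It does not use SPDY or any Avro API; the Multiplexer class and its method names are hypothetical, and the wire transport is left out entirely.

```python
from concurrent.futures import Future
import itertools
import threading


class Multiplexer:
    """Matches responses to requests by stream id (hypothetical helper)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}
        self._lock = threading.Lock()

    def send(self, payload):
        """Register a request; returns (stream_id, future for its response)."""
        future = Future()
        with self._lock:
            stream_id = next(self._ids)
            self._pending[stream_id] = future
        # A real transport would write (stream_id, payload) to the wire here.
        return stream_id, future

    def on_response(self, stream_id, response):
        """Called by the transport's reader; responses may come in any order."""
        with self._lock:
            future = self._pending.pop(stream_id)
        future.set_result(response)
```

A synchronous API can block on the returned future, while an asynchronous one can attach a callback to it, which is one way both styles could sit on the same connection.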
[jira] [Commented] (AVRO-975) Support RPC in C#
[ https://issues.apache.org/jira/browse/AVRO-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868075#comment-13868075 ] Doug Cutting commented on AVRO-975: --- The current C# implementation only has a SocketTransceiver implementation, which does not support secure connections. If you want to interoperate with Java and other implementations, then we should probably implement an HttpTransceiver for C# and use HTTPS. This should be straightforward to implement using C#'s standard HttpWebRequest. Support RPC in C# - Key: AVRO-975 URL: https://issues.apache.org/jira/browse/AVRO-975 Project: Avro Issue Type: New Feature Components: csharp Affects Versions: 1.6.1 Reporter: Jeff Hammerbacher Assignee: Mark Lamley Fix For: 1.7.6 Attachments: 975.patch, Avro-975-00.patch, Avro-975-complate5.patch, Avro-975-complete2.patch, Avro-975-complete3.patch, Avro-975-complete4.patch, Avro975-complete.patch, Castle.Core.dll, buildtask.patch, errors-and-remote-protocols.975.diff, java-compat.diff, propagate-exception.diff, timeout-unhandled-exception.diff, types_ipc.diff, types_main.diff -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (AVRO-1434) ObjectCreator.cs is not thread safe
[ https://issues.apache.org/jira/browse/AVRO-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Cutting updated AVRO-1434: --- Fix Version/s: 1.7.6 Assignee: David Taylor Status: Patch Available (was: Open) This looks reasonable to me. I'll commit it soon if there are no objections. ObjectCreator.cs is not thread safe --- Key: AVRO-1434 URL: https://issues.apache.org/jira/browse/AVRO-1434 Project: Avro Issue Type: Bug Components: csharp Affects Versions: 1.7.5 Environment: Windows Reporter: David Taylor Assignee: David Taylor Fix For: 1.7.6 Attachments: ObjectCreator.cs.diff Public methods ObjectCreator.GetType() assign to shared fields without locks. This causes unpredictable behaviour in a multi-threaded application. The easiest fix is to simply remove the shared variables as they appear to exist as a potential performance improvement but constructing local variables seems to add no significant overhead. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (AVRO-1382) Support for python3
[ https://issues.apache.org/jira/browse/AVRO-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868148#comment-13868148 ] ASF subversion and git services commented on AVRO-1382: --- Commit 1557225 from [~cutting] in branch 'avro/trunk' [ https://svn.apache.org/r1557225 ] AVRO-1382. Add support for Python3. Contributed by Christophe Taton. Support for python3 --- Key: AVRO-1382 URL: https://issues.apache.org/jira/browse/AVRO-1382 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.5 Reporter: Christophe Taton Attachments: AVRO-1382.20131203-001922.diff, AVRO-1382.20140101-123233-0800.diff, AVRO-1382.20140107-231626-0800.diff, AVRO-1382.20140108-165947-0800.diff, AVRO-1382.20140109-232110-0800.diff Hi, I'd need to use Avro from Python3, which would require essentially the following changes, which I am happy to contribute: - rewrite except statements according to new syntax - rewrite print statements according to new syntax - basestring becomes str - update some imports (StringIO becomes io.StringIO, httplib becomes http.client) This would apparently require branching the python code to maintain a version for python2 and a separate version for python3. Any thoughts on how to approach this? Thanks! -- This message was sent by Atlassian JIRA (v6.1.5#6160)
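The four kinds of change listed in this issue can be sketched in a few lines of Python 3, with the Python 2 spelling noted in comments. The function names are made up for illustration and are not taken from the Avro code base.

```python
import io            # Python 2: from StringIO import StringIO
import http.client   # Python 2: import httplib

DEFAULT_PORT = http.client.HTTP_PORT  # Python 2: httplib.HTTP_PORT


def describe(value):
    # Python 2 code would test isinstance(value, basestring).
    if isinstance(value, str):
        return "text"
    return "other"


def safe_int(text):
    try:
        return int(text)
    except ValueError as err:  # Python 2 also allowed: except ValueError, err:
        print("could not parse:", err)  # Python 2: print "could not parse:", err
        return None


def render(lines):
    buffer = io.StringIO()
    for line in lines:
        print(line, file=buffer)  # Python 2: print >>buffer, line
    return buffer.getvalue()
```

The old comma form of except and the print statement are syntax errors in Python 3, and the old module names no longer import, which is why a single source tree for both versions needs either a compatibility layer or the branching approach raised in the issue.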
[jira] [Commented] (AVRO-1382) Support for python3
[ https://issues.apache.org/jira/browse/AVRO-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868155#comment-13868155 ] ASF subversion and git services commented on AVRO-1382: --- Commit 1557227 from [~cutting] in branch 'avro/trunk' [ https://svn.apache.org/r1557227 ] AVRO-1382. Ignore generated files. Support for python3 --- Key: AVRO-1382 URL: https://issues.apache.org/jira/browse/AVRO-1382 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.5 Reporter: Christophe Taton Fix For: 1.7.6 Attachments: AVRO-1382.20131203-001922.diff, AVRO-1382.20140101-123233-0800.diff, AVRO-1382.20140107-231626-0800.diff, AVRO-1382.20140108-165947-0800.diff, AVRO-1382.20140109-232110-0800.diff Hi, I'd need to use Avro from Python3, which would require essentially the following changes, which I am happy to contribute: - rewrite except statements according to new syntax - rewrite print statements according to new syntax - basestring becomes str - update some imports (StringIO becomes io.StringIO, httplib becomes http.client) This would apparently require branching the python code to maintain a version for python2 and a separate version for python3. Any thoughts on how to approach this? Thanks! -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (AVRO-1382) Support for python3
[ https://issues.apache.org/jira/browse/AVRO-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Cutting updated AVRO-1382: --- Resolution: Fixed Fix Version/s: 1.7.6 Assignee: Christophe Taton Status: Resolved (was: Patch Available) I committed this. Thanks, Christophe! Support for python3 --- Key: AVRO-1382 URL: https://issues.apache.org/jira/browse/AVRO-1382 Project: Avro Issue Type: Bug Components: python Affects Versions: 1.7.5 Reporter: Christophe Taton Assignee: Christophe Taton Fix For: 1.7.6 Attachments: AVRO-1382.20131203-001922.diff, AVRO-1382.20140101-123233-0800.diff, AVRO-1382.20140107-231626-0800.diff, AVRO-1382.20140108-165947-0800.diff, AVRO-1382.20140109-232110-0800.diff Hi, I'd need to use Avro from Python3, which would require essentially the following changes, which I am happy to contribute: - rewrite except statements according to new syntax - rewrite print statements according to new syntax - basestring becomes str - update some imports (StringIO becomes io.StringIO, httplib becomes http.client) This would apparently require branching the python code to maintain a version for python2 and a separate version for python3. Any thoughts on how to approach this? Thanks! -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (AVRO-1382) Support for python3
[ https://issues.apache.org/jira/browse/AVRO-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868166#comment-13868166 ] ASF subversion and git services commented on AVRO-1382: --- Commit 1557229 from [~cutting] in branch 'avro/trunk' [ https://svn.apache.org/r1557229 ] AVRO-1382. Add missing license header.
[jira] [Commented] (AVRO-987) Make Avro OSGi ready
[ https://issues.apache.org/jira/browse/AVRO-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868176#comment-13868176 ] ASF subversion and git services commented on AVRO-987: -- Commit 1557231 from [~cutting] in branch 'avro/trunk' [ https://svn.apache.org/r1557231 ] AVRO-987. For back-compatibililty, don't create tools jar as a bundle.
[jira] [Updated] (AVRO-987) Make Avro OSGi ready
[ https://issues.apache.org/jira/browse/AVRO-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Cutting updated AVRO-987: -- Resolution: Fixed Status: Resolved (was: Patch Available) Okay, I just switched avro-tools back to not be a bundle for compatibility. We might make it a bundle in 1.8, as you suggest.
[jira] [Created] (AVRO-1435) Convert avro-tools jar to be a bundle.
Doug Cutting created AVRO-1435: -- Summary: Convert avro-tools jar to be a bundle. Key: AVRO-1435 URL: https://issues.apache.org/jira/browse/AVRO-1435 Project: Avro Issue Type: Improvement Components: java Reporter: Doug Cutting All of the Avro jars except avro-tools were converted to bundles in AVRO-987. We should probably convert avro-tools-nodeps to be a bundle, since some folks might wish to consume it that way. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (AVRO-1435) Convert avro-tools jar to be a bundle.
[ https://issues.apache.org/jira/browse/AVRO-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Cutting updated AVRO-1435: --- Attachment: AVRO-1435.patch Here's a patch that does this by creating a new module for avro-tools-nodeps. This is slightly incompatible, since the jar will now be named avro-tools-nodeps.X.Y.Z.jar rather than avro-tools.X.Y.Z-nodeps.jar. This will require folks who upgrade to modify their dependencies in their pom.xml accordingly. Convert avro-tools jar to be a bundle. -- Key: AVRO-1435 URL: https://issues.apache.org/jira/browse/AVRO-1435 Project: Avro Issue Type: Improvement Components: java Reporter: Doug Cutting Attachments: AVRO-1435.patch All of the Avro jars except avro-tools were converted to bundles in AVRO-987. We should probably convert avro-tools-nodeps to be a bundle, since some folks might wish to consume it that way. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (AVRO-987) Make Avro OSGi ready
[ https://issues.apache.org/jira/browse/AVRO-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868192#comment-13868192 ] Doug Cutting commented on AVRO-987: --- I filed AVRO-1435 for the issue of switching avro-tools to be a bundle.
[jira] [Updated] (AVRO-1435) Convert avro-tools jar to be a bundle.
[ https://issues.apache.org/jira/browse/AVRO-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Cutting updated AVRO-1435: --- Fix Version/s: 1.8.0 Assignee: Doug Cutting Status: Patch Available (was: Open) Convert avro-tools jar to be a bundle. -- Key: AVRO-1435 URL: https://issues.apache.org/jira/browse/AVRO-1435 Project: Avro Issue Type: Improvement Components: java Reporter: Doug Cutting Assignee: Doug Cutting Fix For: 1.8.0 Attachments: AVRO-1435.patch All of the Avro jars except avro-tools were converted to bundles in AVRO-987. We should probably convert avro-tools-nodeps to be a bundle, since some folks might wish to consume it that way. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (AVRO-987) Make Avro OSGi ready
[ https://issues.apache.org/jira/browse/AVRO-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868198#comment-13868198 ] Hudson commented on AVRO-987: - SUCCESS: Integrated in AvroJava #425 (See [https://builds.apache.org/job/AvroJava/425/]) AVRO-987. For back-compatibililty, don't create tools jar as a bundle. (cutting: rev 1557231) * /avro/trunk/lang/java/tools/pom.xml
[jira] [Commented] (AVRO-1382) Support for python3
[ https://issues.apache.org/jira/browse/AVRO-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868199#comment-13868199 ] Hudson commented on AVRO-1382: -- SUCCESS: Integrated in AvroJava #425 (See [https://builds.apache.org/job/AvroJava/425/]) AVRO-1382. Add missing license header. (cutting: rev 1557229) * /avro/trunk/share/test/schemas/echo.avdl AVRO-1382. Ignore generated files. (cutting: rev 1557227) * /avro/trunk/lang/py3 * /avro/trunk/lang/py3/avro * /avro/trunk/lang/py3/avro/tests AVRO-1382. Add support for Python3. Contributed by Christophe Taton. (cutting: rev 1557225) * /avro/trunk/CHANGES.txt * /avro/trunk/build.sh * /avro/trunk/lang/py3 * /avro/trunk/lang/py3/avro * /avro/trunk/lang/py3/avro/__init__.py * /avro/trunk/lang/py3/avro/datafile.py * /avro/trunk/lang/py3/avro/io.py * /avro/trunk/lang/py3/avro/ipc.py * /avro/trunk/lang/py3/avro/protocol.py * /avro/trunk/lang/py3/avro/schema.py * /avro/trunk/lang/py3/avro/tests * /avro/trunk/lang/py3/avro/tests/av_bench.py * /avro/trunk/lang/py3/avro/tests/gen_interop_data.py * /avro/trunk/lang/py3/avro/tests/run_tests.py * /avro/trunk/lang/py3/avro/tests/sample_http_client.py * /avro/trunk/lang/py3/avro/tests/sample_http_server.py * /avro/trunk/lang/py3/avro/tests/test_datafile.py * /avro/trunk/lang/py3/avro/tests/test_datafile_interop.py * /avro/trunk/lang/py3/avro/tests/test_io.py * /avro/trunk/lang/py3/avro/tests/test_ipc.py * /avro/trunk/lang/py3/avro/tests/test_protocol.py * /avro/trunk/lang/py3/avro/tests/test_schema.py * /avro/trunk/lang/py3/avro/tests/test_script.py * /avro/trunk/lang/py3/avro/tests/txsample_http_client.py * /avro/trunk/lang/py3/avro/tests/txsample_http_server.py * /avro/trunk/lang/py3/avro/tool.py * /avro/trunk/lang/py3/avro/txipc.py * /avro/trunk/lang/py3/scripts * /avro/trunk/lang/py3/scripts/avro * /avro/trunk/lang/py3/setup.py * /avro/trunk/share/test/schemas/echo.avdl
[jira] [Commented] (AVRO-1434) ObjectCreator.cs is not thread safe
[ https://issues.apache.org/jira/browse/AVRO-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868286#comment-13868286 ] ASF subversion and git services commented on AVRO-1434: --- Commit 1557250 from [~cutting] in branch 'avro/trunk' [ https://svn.apache.org/r1557250 ] AVRO-1434. C#: Fix ObjectCreator to be thread safe. Contributed by David Taylor.
[jira] [Updated] (AVRO-1434) ObjectCreator.cs is not thread safe
[ https://issues.apache.org/jira/browse/AVRO-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Cutting updated AVRO-1434: --- Resolution: Fixed Status: Resolved (was: Patch Available) I committed this. Thanks, David.
[jira] [Commented] (AVRO-1124) RESTful service for holding schemas
[ https://issues.apache.org/jira/browse/AVRO-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868292#comment-13868292 ] Jonathan Herriott commented on AVRO-1124: - Maybe I am missing something, but stated in the original ticket it read that the RepositoryServer should be load balanced. The thing that appears to be an issue for me, but maybe I have a misunderstanding, is that each Producer must register the schemas it is using with the SchemaRepository. If this is true, and the RepositoryServer is load balanced, it means schema IDs will slowly diverge between RepositoryServers if an increment count is used for IDs. This issue is solved with MD5s, which is what you stated as being done at LinkedIn. Another issue I foresee is if load balanced, and that particular schema only got registered with one instance of the RepositoryServer but not with another instance, and the Consumer asks the server it hasn't been registered with, then it will fail. How do you guys manage or mitigate these risks? Do you first run something to register schemas with all RepositoryServers? Does the job just get processed again and *hope* that it doesn't hit the same RepositoryServer again? RESTful service for holding schemas --- Key: AVRO-1124 URL: https://issues.apache.org/jira/browse/AVRO-1124 Project: Avro Issue Type: New Feature Reporter: Jay Kreps Assignee: Jay Kreps Attachments: AVRO-1124-can-read-with.patch, AVRO-1124-draft.patch, AVRO-1124-validators-preliminary.patch, AVRO-1124.patch, AVRO-1124.patch Motivation: It is nice to be able to pass around data in serialized form but still know the exact schema that was used to serialize it. The overhead of storing the schema with each record is too high unless the individual records are very large. 
There are workarounds for some common cases: in the case of files a schema can be stored once with a file of many records amortizing the per-record cost, and in the case of RPC the schema can be negotiated ahead of time and used for many requests. For other uses, though it is nice to be able to pass a reference to a given schema using a small id and allow this to be looked up. Since only a small number of schemas are likely to be active for a given data source, these can easily be cached, so the number of remote lookups is very small (one per active schema version). Basically this would consist of two things: 1. A simple REST service that stores and retrieves schemas 2. Some helper java code for fetching and caching schemas for people using the registry We have used something like this at LinkedIn for a few years now, and it would be nice to standardize this facility to be able to build up common tooling around it. This proposal will be based on what we have, but we can change it as ideas come up. The facilities this provides are super simple, basically you can register a schema which gives back a unique id for it or you can query for a schema. There is almost no code, and nothing very complex. The contract is that before emitting/storing a record you must first publish its schema to the registry or know that it has already been published (by checking your cache of published schemas). When reading you check your cache and if you don't find the id/schema pair there you query the registry to look it up. I will explain some of the nuances in more detail below. An added benefit of such a repository is that it makes a few other things possible: 1. A graphical browser of the various data types that are currently used and all their previous forms. 2. Automatic enforcement of compatibility rules. 
Data is always compatible in the sense that the reader will always deserialize it (since they are using the same schema as the writer) but this does not mean it is compatible with the expectations of the reader. For example if an int field is changed to a string that will almost certainly break anyone relying on that field. This definition of compatibility can differ for different use cases and should likely be pluggable. Here is a description of one of our uses of this facility at LinkedIn. We use this to retain a schema with log data end-to-end from the producing app to various real-time consumers as well as a set of resulting AvroFile in Hadoop. This schema metadata can then be used to auto-create hive tables (or add new fields to existing tables), or inferring pig fields, all without manual intervention. One important definition of compatibility that is nice to enforce is compatibility with historical data for a given table. Log data is usually loaded in an append-only manner, so if someone changes an int field in a particular data set to be a string, tools like pig or hive
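The concern raised in the comment above, that counter-based ids diverge across load-balanced servers while content hashes do not, can be shown with a small Python sketch. Both registry classes are hypothetical stand-ins, and the normalization step (JSON re-serialized with sorted keys) is an assumption made for the example; it is not Avro's Parsing Canonical Form and not necessarily what LinkedIn's implementation does.

```python
import hashlib
import json


def schema_id(schema_json):
    """Derive an id from schema content (assumed: MD5 of a normalized form)."""
    # Re-serialize with sorted keys and no extra whitespace so that formatting
    # differences do not change the id. This is a simplification.
    normalized = json.dumps(json.loads(schema_json), sort_keys=True,
                            separators=(",", ":"))
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


class CounterRegistry:
    """Assigns ids from a local counter; ids depend on registration order."""

    def __init__(self):
        self._ids = {}

    def register(self, schema_json):
        # Reuse the existing id if known, otherwise take the next number.
        return self._ids.setdefault(schema_json, len(self._ids) + 1)


class HashRegistry:
    """Assigns ids from schema content; independent servers agree."""

    def __init__(self):
        self._schemas = {}

    def register(self, schema_json):
        sid = schema_id(schema_json)
        self._schemas[sid] = schema_json
        return sid
```

Content-derived ids only solve the naming half of the problem: a server that never saw a registration still cannot answer a lookup, which is the second risk the comment describes.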
[jira] [Commented] (AVRO-1434) ObjectCreator.cs is not thread safe
[ https://issues.apache.org/jira/browse/AVRO-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868350#comment-13868350 ] Hudson commented on AVRO-1434: -- SUCCESS: Integrated in AvroJava #426 (See [https://builds.apache.org/job/AvroJava/426/]) AVRO-1434. C#: Fix ObjectCreator to be thread safe. Contributed by David Taylor. (cutting: rev 1557250) * /avro/trunk/CHANGES.txt * /avro/trunk/lang/csharp/src/apache/main/Specific/ObjectCreator.cs
[VOTE] Avro release 1.7.6 (rc0)
I have created a candidate build for Avro release 1.7.6. Changes are listed at: http://s.apache.org/avro176 Please download the sources, check them, and vote. http://people.apache.org/~cutting/avro-1.7.6-rc0/ The Maven staging repository is at: https://repository.apache.org/content/repositories/orgapacheavro-1000/ Thanks in advance for voting! Doug
[jira] [Commented] (AVRO-1124) RESTful service for holding schemas
[ https://issues.apache.org/jira/browse/AVRO-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868379#comment-13868379 ] Jonathan Herriott commented on AVRO-1124: - Ah, I missed the part about the pluggable key value store, so the onus is put on another technology to handle the consistency and the Repository just needs to handle the distribution. RESTful service for holding schemas --- Key: AVRO-1124 URL: https://issues.apache.org/jira/browse/AVRO-1124 Project: Avro Issue Type: New Feature Reporter: Jay Kreps Assignee: Jay Kreps Attachments: AVRO-1124-can-read-with.patch, AVRO-1124-draft.patch, AVRO-1124-validators-preliminary.patch, AVRO-1124.patch, AVRO-1124.patch Motivation: It is nice to be able to pass around data in serialized form but still know the exact schema that was used to serialize it. The overhead of storing the schema with each record is too high unless the individual records are very large. There are workarounds for some common cases: in the case of files a schema can be stored once with a file of many records amortizing the per-record cost, and in the case of RPC the schema can be negotiated ahead of time and used for many requests. For other uses, though it is nice to be able to pass a reference to a given schema using a small id and allow this to be looked up. Since only a small number of schemas are likely to be active for a given data source, these can easily be cached, so the number of remote lookups is very small (one per active schema version). Basically this would consist of two things: 1. A simple REST service that stores and retrieves schemas 2. Some helper java code for fetching and caching schemas for people using the registry We have used something like this at LinkedIn for a few years now, and it would be nice to standardize this facility to be able to build up common tooling around it. This proposal will be based on what we have, but we can change it as ideas come up. 
The facilities this provides are super simple, basically you can register a schema which gives back a unique id for it or you can query for a schema. There is almost no code, and nothing very complex. The contract is that before emitting/storing a record you must first publish its schema to the registry or know that it has already been published (by checking your cache of published schemas). When reading you check your cache and if you don't find the id/schema pair there you query the registry to look it up. I will explain some of the nuances in more detail below. An added benefit of such a repository is that it makes a few other things possible: 1. A graphical browser of the various data types that are currently used and all their previous forms. 2. Automatic enforcement of compatibility rules. Data is always compatible in the sense that the reader will always deserialize it (since they are using the same schema as the writer) but this does not mean it is compatible with the expectations of the reader. For example if an int field is changed to a string that will almost certainly break anyone relying on that field. This definition of compatibility can differ for different use cases and should likely be pluggable. Here is a description of one of our uses of this facility at LinkedIn. We use this to retain a schema with log data end-to-end from the producing app to various real-time consumers as well as a set of resulting AvroFile in Hadoop. This schema metadata can then be used to auto-create hive tables (or add new fields to existing tables), or inferring pig fields, all without manual intervention. One important definition of compatibility that is nice to enforce is compatibility with historical data for a given table. Log data is usually loaded in an append-only manner, so if someone changes an int field in a particular data set to be a string, tools like pig or hive that expect static columns will be unusable. 
Even with plain-vanilla map/reduce, processing data where columns and types change willy-nilly is painful. However the person emitting this kind of data may not know all the details of compatible schema evolution. We use the schema repository to validate that any change made to a schema doesn't violate the compatibility model, and reject the update if it does. We do this check both at run time, and also as part of the ant task that generates specific record code (as an early warning). Some details to consider: Deployment This can just be programmed against the servlet API and deployed as a standard war. You can run many instances and load-balance traffic over them. Persistence The storage needs are not very heavy. The clients are expected to cache the id=schema mapping, and
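The publish-before-write and cache-then-lookup contract described in this proposal can be sketched as follows. This is an in-memory illustration in Python, not the helper Java code the proposal mentions; InMemoryRegistry stands in for the REST service, and all class and method names are assumptions made for the example.

```python
class InMemoryRegistry:
    """Stand-in for the REST service (hypothetical; no network calls)."""

    def __init__(self):
        self._by_id = {}
        self.lookups = 0  # counts remote lookups, for illustration

    def register(self, schema):
        # Return the existing id if this schema was already published.
        for sid, known in self._by_id.items():
            if known == schema:
                return sid
        sid = len(self._by_id) + 1
        self._by_id[sid] = schema
        return sid

    def lookup(self, sid):
        self.lookups += 1
        return self._by_id[sid]


class CachingClient:
    """Caches the id/schema mapping so the registry is rarely contacted."""

    def __init__(self, registry):
        self._registry = registry
        self._id_by_schema = {}
        self._schema_by_id = {}

    def id_for(self, schema):
        """Writer side: publish the schema unless it is already cached."""
        if schema not in self._id_by_schema:
            sid = self._registry.register(schema)
            self._id_by_schema[schema] = sid
            self._schema_by_id[sid] = schema
        return self._id_by_schema[schema]

    def schema_for(self, sid):
        """Reader side: check the cache first, then ask the registry."""
        if sid not in self._schema_by_id:
            self._schema_by_id[sid] = self._registry.lookup(sid)
        return self._schema_by_id[sid]
```

With this shape, a reader contacts the registry once per schema id it has not seen before, which matches the proposal's estimate of one remote lookup per active schema version.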