Author: blue
Date: Thu Dec 3 21:35:44 2015
New Revision: 1717850
URL: http://svn.apache.org/viewvc?rev=1717850&view=rev
Log:
AVRO-1747: JavaScript: Add implementation.
A few features:
+ Fast! Typically twice as fast as JSON with much smaller encodings.
+ Full Avro support, including recursive schemas, sort order, and
evolution.
+ Serialization of arbitrary JavaScript objects via logical types.
+ Unopinionated 64-bit integer compatibility.
+ No dependencies, it even runs in the browser.
The previous API is included with deprecation warnings (this adds
`underscore` as a dependency).
Added:
avro/trunk/lang/js/LICENSE
avro/trunk/lang/js/NOTICE
avro/trunk/lang/js/README.md
avro/trunk/lang/js/doc/
avro/trunk/lang/js/doc/API.md
avro/trunk/lang/js/doc/Advanced-usage.md
avro/trunk/lang/js/doc/Home.md
avro/trunk/lang/js/etc/
avro/trunk/lang/js/etc/browser/
avro/trunk/lang/js/etc/browser/avro.js
avro/trunk/lang/js/etc/browser/crypto.js
avro/trunk/lang/js/etc/deprecated/
avro/trunk/lang/js/etc/deprecated/Gruntfile.js
- copied, changed from r1717830, avro/trunk/lang/js/Gruntfile.js
avro/trunk/lang/js/etc/deprecated/README
- copied, changed from r1717830, avro/trunk/lang/js/README
avro/trunk/lang/js/etc/deprecated/test_validator.js
- copied, changed from r1717830, avro/trunk/lang/js/test/validator.js
avro/trunk/lang/js/etc/deprecated/validator.js
- copied, changed from r1717830, avro/trunk/lang/js/lib/validator.js
avro/trunk/lang/js/lib/files.js
avro/trunk/lang/js/lib/index.js
avro/trunk/lang/js/lib/schemas.js
avro/trunk/lang/js/lib/utils.js
avro/trunk/lang/js/test/dat/
avro/trunk/lang/js/test/dat/Id.avsc
avro/trunk/lang/js/test/dat/Person.avsc
avro/trunk/lang/js/test/dat/person-10.avro
avro/trunk/lang/js/test/dat/person-10.avro.raw
avro/trunk/lang/js/test/dat/person-10.no-codec.avro
avro/trunk/lang/js/test/test_files.js
avro/trunk/lang/js/test/test_schemas.js
avro/trunk/lang/js/test/test_utils.js
Removed:
avro/trunk/lang/js/Gruntfile.js
avro/trunk/lang/js/README
avro/trunk/lang/js/lib/validator.js
avro/trunk/lang/js/test/validator.js
Modified:
avro/trunk/CHANGES.txt
avro/trunk/lang/js/build.sh
avro/trunk/lang/js/package.json
Modified: avro/trunk/CHANGES.txt
URL:
http://svn.apache.org/viewvc/avro/trunk/CHANGES.txt?rev=1717850&r1=1717849&r2=1717850&view=diff
==============================================================================
--- avro/trunk/CHANGES.txt (original)
+++ avro/trunk/CHANGES.txt Thu Dec 3 21:35:44 2015
@@ -61,6 +61,8 @@ Avro 1.8.0 (10 August 2014)
AVRO-1672. Java: Add date/time logical types and conversions. (blue)
+ AVRO-1747. JS: Add Javascript IO implementation. (Matthieu Monsch via blue)
+
OPTIMIZATIONS
IMPROVEMENTS
Added: avro/trunk/lang/js/LICENSE
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/LICENSE?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/LICENSE (added)
+++ avro/trunk/lang/js/LICENSE Thu Dec 3 21:35:44 2015
@@ -0,0 +1,202 @@
+
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
Added: avro/trunk/lang/js/NOTICE
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/NOTICE?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/NOTICE (added)
+++ avro/trunk/lang/js/NOTICE Thu Dec 3 21:35:44 2015
@@ -0,0 +1,5 @@
+Apache Avro
+Copyright 2011-2015 The Apache Software Foundation
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
Added: avro/trunk/lang/js/README.md
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/README.md?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/README.md (added)
+++ avro/trunk/lang/js/README.md Thu Dec 3 21:35:44 2015
@@ -0,0 +1,103 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+
+# Avro-js
+
+Pure JavaScript implementation of the [Avro specification](https://avro.apache.org/docs/current/spec.html).
+
+
+## Features
+
++ Fast! Typically twice as fast as JSON with much smaller encodings.
++ Full Avro support, including recursive schemas, sort order, and evolution.
++ Serialization of arbitrary JavaScript objects via logical types.
++ Unopinionated 64-bit integer compatibility.
++ No dependencies, `avro-js` even runs in the browser.
+
+
+## Installation
+
+```bash
+$ npm install avro-js
+```
+
+`avro-js` is compatible with all versions of [node.js][] since `0.11` and major
+browsers via [browserify][].
+
+
+## Documentation
+
+See `doc/` folder.
+
+
+## Examples
+
+Inside a node.js module, or using browserify:
+
+```javascript
+var avro = require('avro-js');
+```
+
++ Encode and decode objects:
+
+ ```javascript
+ // We can declare a schema inline:
+ var type = avro.parse({
+ name: 'Pet',
+ type: 'record',
+ fields: [
+ {name: 'kind', type: {name: 'Kind', type: 'enum', symbols: ['CAT', 'DOG']}},
+ {name: 'name', type: 'string'}
+ ]
+ });
+ var pet = {kind: 'CAT', name: 'Albert'};
+ var buf = type.toBuffer(pet); // Serialized object.
+ var obj = type.fromBuffer(buf); // {kind: 'CAT', name: 'Albert'}
+ ```
+
++ Generate random instances of a schema:
+
+ ```javascript
+ // We can also parse a JSON-stringified schema:
+ var type = avro.parse('{"type": "fixed", "name": "Id", "size": 4}');
+ var id = type.random(); // E.g. Buffer([48, 152, 2, 123])
+ ```
+
++ Check whether an object fits a given schema:
+
+ ```javascript
+ // Or we can specify a path to a schema file (not in the browser):
+ var type = avro.parse('./Person.avsc');
+ var person = {name: 'Bob', address: {city: 'Cambridge', zip: '02139'}};
+ var status = type.isValid(person); // Boolean status.
+ ```
+
++ Get a [readable stream][readable-stream] of decoded records from an Avro
+ container file (not in the browser):
+
+ ```javascript
+ avro.createFileDecoder('./records.avro')
+ .on('metadata', function (type) { /* `type` is the writer's type. */ })
+ .on('data', function (record) { /* Do something with the record. */ });
+ ```
+
+
+[node.js]: https://nodejs.org/en/
+[readable-stream]: https://nodejs.org/api/stream.html#stream_class_stream_readable
+[browserify]: http://browserify.org/
Modified: avro/trunk/lang/js/build.sh
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/build.sh?rev=1717850&r1=1717849&r2=1717850&view=diff
==============================================================================
--- avro/trunk/lang/js/build.sh (original)
+++ avro/trunk/lang/js/build.sh Thu Dec 3 21:35:44 2015
@@ -20,21 +20,21 @@ set -e
cd `dirname "$0"`
case "$1" in
- test)
- npm install
- grunt test
- ;;
-
- dist)
- ;;
-
- clean)
- ;;
-
- *)
- echo "Usage: $0 {test|dist|clean}"
- exit 1
-
+ test)
+ npm install
+ npm test
+ ;;
+ dist)
+ npm pack
+ mkdir -p ../../dist/js
+ mv avro-js-*.tgz ../../dist/js
+ ;;
+ clean)
+ rm -rf node_modules
+ ;;
+ *)
+ echo "Usage: $0 {test|dist|clean}" >&2
+ exit 1
esac
exit 0
Added: avro/trunk/lang/js/doc/API.md
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/doc/API.md?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/doc/API.md (added)
+++ avro/trunk/lang/js/doc/API.md Thu Dec 3 21:35:44 2015
@@ -0,0 +1,758 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+
++ [Parsing schemas](#parsing-schemas)
++ [Avro types](#avro-types)
++ [Records](#records)
++ [Files and streams](#files-and-streams)
+
+
+## Parsing schemas
+
+### `parse(schema, [opts])`
+
++ `schema` {Object|String} An Avro schema, represented by one of:
+ + A string containing a JSON-stringified schema (e.g. `'["null", "int"]'`).
+ + A path to a file containing a JSON-stringified schema (e.g.
+ `'./Schema.avsc'`).
+ + A decoded schema object (e.g. `{type: 'array', items: 'int'}`).
++ `opts` {Object} Parsing options. The following keys are currently supported:
+ + `logicalTypes` {Object} Optional dictionary of
+ [`LogicalType`](#class-logicaltypeattrs-opts-types). This can be used to
+ support serialization and deserialization of arbitrary native objects.
+ + `namespace` {String} Optional parent namespace.
+ + `registry` {Object} Optional registry of predefined type names. This can
+ for example be used to override the types used for primitives.
+ + `typeHook(attrs, opts)` {Function} Function called before each new type is
+ instantiated. The relevant decoded schema is available as first argument
+ and the parsing options as second. This function can optionally return a
+ type which will then be used in place of the result of parsing `schema`.
+ See below for more details.
+
+Parse a schema and return an instance of the corresponding
+[`Type`](#class-type).
+
+Using the `typeHook` option, it is possible to customize the parsing process by
+intercepting the creation of any type. As a sample use-case, we show below how
+to replace the default `EnumType` (which decodes `enum` values into strings)
+with a `LongType` (which will decode the `enum`'s values into integers). This
+can be useful when the `enum` already exists as a JavaScript object (e.g. if it
+was generated by TypeScript).
+
+```javascript
+var longType = new avro.types.LongType();
+function typeHook(schema) {
+ if (schema.type === 'enum') {
+ // For simplicity, we don't do any bounds checking here but we could by
+ // implementing a "bounded long" logical type and returning that instead.
+ return longType;
+ }
+ // Falling through will cause the default type to be used.
+}
+```
+
+To use it:
+
+```javascript
+// Assume we already have an "enum" with each symbol.
+var PETS = {CAT: 0, DOG: 1};
+
+// We can provide our hook when parsing a schema.
+var type = avro.parse({
+ name: 'Pet',
+ type: 'enum',
+ symbols: ['CAT', 'DOG']
+}, {typeHook: typeHook});
+
+// And encode integer enum values directly.
+var buf = type.toBuffer(PETS.CAT);
+```
+
+Finally, type hooks work well with logical types (for example to dynamically
+add `logicalType` attributes to a schema).
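As a rough illustration of that last point, a hook could tag matching schemas with a `logicalType` attribute and then fall through to the default parsing. This is only a sketch: the `isDate` flag and the `'date'` logical type name are hypothetical, and it assumes the parser reads `logicalType` from the same attributes object once the hook has returned.

```javascript
// Hypothetical convention: `long` schemas flagged with `isDate: true` should
// be handled by a logical type registered under the (assumed) name 'date'.
function dateHook(attrs) {
  if (attrs.type === 'long' && attrs.isDate && !attrs.logicalType) {
    attrs.logicalType = 'date'; // Mutate in place, then fall through.
  }
  // Returning undefined lets the default parsing continue.
}

// The hook is a plain function, so it can be exercised without the parser.
var attrs = {type: 'long', isDate: true};
var result = dateHook(attrs);
```

Such a hook would then be passed as `typeHook`, together with a matching entry in the `logicalTypes` option.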
+
+
+## Avro types
+
+All the classes below are available in the `avro.types` namespace:
+
++ [`Type`](#class-type)
++ Primitive types:
+ + `BooleanType`
+ + `BytesType`
+ + `DoubleType`
+ + `FloatType`
+ + `IntType`
+ + [`LongType`](#class-longtypeattrs-opts)
+ + `NullType`
+ + `StringType`
++ Complex types:
+ + [`ArrayType`](#class-arraytypeattrs-opts)
+ + [`EnumType`](#class-enumtypeattrs-opts)
+ + [`FixedType`](#class-fixedtypeattrs-opts)
+ + [`MapType`](#class-maptypeattrs-opts)
+ + [`RecordType`](#class-recordtypeattrs-opts)
+ + [`UnionType`](#class-uniontypeattrs-opts)
++ [`LogicalType`](#class-logicaltypeattrs-opts-types)
+
+
+### Class `Type`
+
+"Abstract" base Avro type class; all implementations inherit from it.
+
+##### `type.decode(buf, [pos,] [resolver])`
+
++ `buf` {Buffer} Buffer to read from.
++ `pos` {Number} Offset to start reading from.
++ `resolver` {Resolver} Optional resolver to decode values serialized from
+ another schema. See [`createResolver`](#typecreateresolverwritertype) for how
+ to create one.
+
+Returns `{value: value, offset: offset}` if `buf` contains a valid encoding of
+`type` (`value` being the decoded value, and `offset` the new offset in the
+buffer). Returns `{value: undefined, offset: -1}` when the buffer is too short.
+
+##### `type.encode(val, buf, [pos])`
+
++ `val` {...} The value to encode. An error will be raised if this isn't a
+ valid `type` value.
++ `buf` {Buffer} Buffer to write to.
++ `pos` {Number} Offset to start writing at.
+
+Encode a value into an existing buffer. If enough space was available in `buf`,
+returns the new (non-negative) offset, otherwise returns `-N` where `N` is the
+(positive) number of bytes by which the buffer was short.
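The negative return value makes it possible to retry with a buffer of exactly the right size. The sketch below shows one way to do this; `fakeType` is a stand-in written for this example (it is not part of the library) that follows the return convention described above, and `encodeGrowing` is a hypothetical helper name.

```javascript
// Stand-in for a parsed type, so that the sketch runs on its own. It returns
// the new offset on success, or `-N` when the buffer is short by N bytes.
var fakeType = {
  encode: function (val, buf, pos) {
    var start = pos || 0;
    var end = start + Buffer.byteLength(val);
    if (end > buf.length) {
      return buf.length - end; // Negative: number of missing bytes.
    }
    buf.write(val, start);
    return end;
  }
};

// Encode `val`, growing the buffer once if the first attempt does not fit.
function encodeGrowing(type, val, size) {
  var buf = Buffer.alloc(size);
  var offset = type.encode(val, buf, 0);
  if (offset < 0) {
    // Exactly `-offset` more bytes are needed; allocate them and retry.
    buf = Buffer.alloc(size - offset);
    offset = type.encode(val, buf, 0);
  }
  return buf.slice(0, offset);
}

var out = encodeGrowing(fakeType, 'hello', 2);
```

With a real type, the same helper would be called with the result of `avro.parse` in place of `fakeType`.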
+
+##### `type.fromBuffer(buf, [resolver,] [noCheck])`
+
++ `buf` {Buffer} Bytes containing a serialized value of `type`.
++ `resolver` {Resolver} To decode values serialized from another schema. See
+ [`createResolver`](#typecreateresolverwritertype) for how to create a
+ resolver.
++ `noCheck` {Boolean} Do not check that the entire buffer has been read. This
+ can be useful when using a resolver which only decodes fields at the start of
+ the buffer, allowing decoding to bail early and yield significant performance
+ speedups.
+
+Deserialize a buffer into its corresponding value.
+
+##### `type.toBuffer(val)`
+
++ `val` {...} The value to encode. It must be a valid `type` value.
+
+Returns a `Buffer` containing the Avro serialization of `val`.
+
+##### `type.fromString(str)`
+
++ `str` {String} String representing a JSON-serialized object.
+
+Deserialize a JSON-encoded object of `type`.
+
+##### `type.toString([val])`
+
++ `val` {...} The value to serialize. If not specified, this method will return
+ a human-friendly description of `type`.
+
+Serialize an object into a JSON-encoded string.
+
+##### `type.isValid(val, [opts])`
+
++ `val` {...} The value to validate.
++ `opts` {Object} Options:
+ + `errorHook(path, any, type)` {Function} Function called when an invalid
+ value is encountered. When an invalid value causes its parent values to
+ also be invalid, the latter do not trigger a callback. `path` will be an
+ array of strings identifying where the mismatch occurred. See below for a
+ few examples.
+
+Check whether `val` is a valid `type` value.
+
+For complex schemas, it can be difficult to figure out which part(s) of `val`
+are invalid. The `errorHook` option provides access to more information about
+these mismatches. We illustrate a few use-cases below:
+
+```javascript
+// A sample schema.
+var personType = avro.parse({
+ type: 'record',
+ name: 'Person',
+ fields: [
+ {name: 'age', type: 'int'},
+ {name: 'names', type: {type: 'array', items: 'string'}}
+ ]
+});
+
+// A corresponding invalid record.
+var invalidPerson = {age: null, names: ['ann', 3, 'bob']};
+```
+
+As a first use-case, we use the `errorHook` to implement a function to gather
+all invalid paths in a value (if any):
+
+```javascript
+function getInvalidPaths(type, val) {
+ var paths = [];
+ type.isValid(val, {errorHook: function (path) { paths.push(path.join()); }});
+ return paths;
+}
+
+var paths = getInvalidPaths(personType, invalidPerson); // == ['age', 'names,1']
+```
+
+We can also implement an `assertValid` function which throws a helpful error on
+the first mismatch encountered (if any):
+
+```javascript
+var util = require('util');
+
+function assertValid(type, val) {
+ return type.isValid(val, {errorHook: hook});
+
+ function hook(path, any) {
+ throw new Error(util.format('invalid %s: %j', path.join(), any));
+ }
+}
+
+try {
+ assertValid(personType, invalidPerson); // Will throw.
+} catch (err) {
+ // err.message === 'invalid age: null'
+}
+```
+
+##### `type.clone(val, [opts])`
+
++ `val` {...} The object to copy.
++ `opts` {Object} Options:
+ + `coerceBuffers` {Boolean} Allow coercion of JSON buffer representations
+ into actual `Buffer` objects.
+ + `fieldHook(field, any, type)` {Function} Function called when each record
+ field is populated. The value returned by this function will be used
+ instead of `any`. `field` is the current `Field` instance and `type` the
+ parent type.
+ + `wrapUnions` {Boolean} Avro's JSON representation expects all union values
+ to be wrapped inside objects. Setting this parameter to `true` will try to
+ wrap unwrapped union values into their first matching type.
+
+Deep copy a value of `type`.
+
+##### `type.compare(val1, val2)`
+
++ `val1` {...} Value of `type`.
++ `val2` {...} Value of `type`.
+
+Returns `0` if both values are equal according to their [sort
+order][sort-order], `-1` if the first is smaller than the second, and `1`
+otherwise. Comparing invalid values is undefined behavior.
+
+##### `type.compareBuffers(buf1, buf2)`
+
++ `buf1` {Buffer} `type` value bytes.
++ `buf2` {Buffer} `type` value bytes.
+
+Similar to [`compare`](#typecompareval1-val2), but doesn't require decoding
+values.
+
+##### `type.createResolver(writerType)`
+
++ `writerType` {Type} Writer type.
+
+Create a resolver that can be passed to the `type`'s
+[`decode`](#typedecodebuf-pos-resolver) and
+[`fromBuffer`](#typefrombufferbuf-resolver-nocheck) methods. This will enable
+decoding values which had been serialized using `writerType`, according to the
+Avro [resolution rules][schema-resolution]. If the schemas are incompatible,
+this method will throw an error.
+
+For example, assume we have the following two versions of a type:
+
+```javascript
+// A schema's first version.
+var v1 = avro.parse({
+ name: 'Person',
+ type: 'record',
+ fields: [
+ {name: 'name', type: 'string'},
+ {name: 'age', type: 'int'}
+ ]
+});
+
+// The updated version.
+var v2 = avro.parse({
+ type: 'record',
+ name: 'Person',
+ fields: [
+ {
+ name: 'name', type: [
+ 'string',
+ {
+ name: 'Name',
+ type: 'record',
+ fields: [
+ {name: 'first', type: 'string'},
+ {name: 'last', type: 'string'}
+ ]
+ }
+ ]
+ },
+ {name: 'phone', type: ['null', 'string'], default: null}
+ ]
+});
+```
+
+The two types are compatible since the `name` field is present in both (the
+`string` can be promoted to the new `union`) and the new `phone` field has a
+default value.
+
+```javascript
+// We can therefore create a resolver.
+var resolver = v2.createResolver(v1);
+
+// And pass it whenever we want to decode from the old type to the new.
+var buf = v1.toBuffer({name: 'Ann', age: 25});
+var obj = v2.fromBuffer(buf, resolver); // === {name: {string: 'Ann'}, phone: null}
+```
+
+See the [advanced usage page](Advanced-usage) for more details on how schema
+evolution can be used to significantly speed up decoding.
+
+##### `type.random()`
+
+Returns a random value of `type`.
+
+##### `type.getName()`
+
+Returns `type`'s fully qualified name if it exists, `undefined` otherwise.
+
+##### `type.getSchema([noDeref])`
+
++ `noDeref` {Boolean} Do not dereference any type names.
+
+Returns `type`'s [canonical schema][canonical-schema] (as a string). This can
+be used to compare schemas for equality.
+
+##### `type.getFingerprint([algorithm])`
+
++ `algorithm` {String} Algorithm used to generate the schema's [fingerprint][].
+ Defaults to `md5`. *Not supported in the browser.*
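For intuition, a fingerprint of this kind can be reproduced outside the library by hashing a canonical schema string with Node's `crypto` module. The sketch below makes two assumptions: that the hash input is the canonical schema string (as returned by `getSchema`), and that the result is wanted as a `Buffer`. `fingerprintOf` is a hypothetical helper, not a library function.

```javascript
var crypto = require('crypto');

// Hypothetical helper: hash a canonical schema string, defaulting to md5.
function fingerprintOf(canonicalSchema, algorithm) {
  return crypto.createHash(algorithm || 'md5')
    .update(canonicalSchema)
    .digest(); // Buffer containing the raw hash bytes.
}

// Canonical form of a small `fixed` schema (attributes in canonical order).
var canonical = '{"name":"Id","type":"fixed","size":4}';
var fingerprint = fingerprintOf(canonical);
```

Two schemas with the same canonical form produce the same bytes, which is what makes fingerprints usable as compact schema identifiers.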
+
+##### `Type.__reset(size)`
+
++ `size` {Number} New buffer size in bytes.
+
+This method resizes the internal buffer used to encode all types. You should
+only ever need to call this if you are encoding very large values and need to
+reclaim memory.
+
+
+#### Class `LongType(attrs, [opts])`
+
++ `attrs` {Object} Decoded type attributes.
++ `opts` {Object} Parsing options.
+
+##### `LongType.using(methods, [noUnpack])`
+
++ `methods` {Object} Method implementations dictionary keyed by method name,
+ see below for details on each of the functions to implement.
++ `noUnpack` {Boolean} Do not automatically unpack bytes before passing them to
+ the above `methods`' `fromBuffer` function and pack bytes returned by its
+ `toBuffer` function.
+
+This function provides a way to support arbitrary long representations. Doing
+so requires implementing the following methods (a few examples are available
+[here][custom-long]):
+
++ `fromBuffer(buf)`
+
+ + `buf` {Buffer} Encoded long. If `noUnpack` is off (the default), `buf` will
+ be an 8-byte buffer containing the long's unpacked representation.
+ Otherwise, `buf` will contain a variable length buffer with the long's
+ packed representation.
+
+ This method should return the corresponding decoded long.
+
++ `toBuffer(val)`
+
+ + `val` {...} Decoded long.
+
+ If `noUnpack` is off (the default), this method should return an 8-byte
+ buffer with the `long`'s unpacked representation. Otherwise, `toBuffer`
+ should return an already packed buffer (of variable length).
+
++ `fromJSON(any)`
+
+ + `any` {Number|...} Parsed value. To ensure that the `fromString` method
+ works correctly on data JSON-serialized according to the Avro spec, this
+ method should at least support numbers as input.
+
+ This method should return the corresponding decoded long.
+
+ It might also be useful to support other kinds of input (typically the output
+ of the long implementation's `toJSON` method) to enable serializing large
+ numbers without loss of precision (at the cost of violating the Avro spec).
+
++ `toJSON(val)`
+
+ + `val` {...} Decoded long.
+
+ This method should return the `long`'s JSON representation.
+
++ `isValid(val, [opts])`
+
+ See [`Type.isValid`](#typeisvalidval-opts).
+
++ `compare(val1, val2)`
+
+ See [`Type.compare`](#typecompareval1-val2).
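
The JSON-related methods above can be sketched without any external library. The snippet below is only an illustration, assuming a runtime with native `BigInt` support (which postdates this commit); the `jsonMethods` name and the choice of emitting out-of-range values as decimal strings are assumptions, not part of `avro`. The `fromBuffer` and `toBuffer` methods are deliberately left out.

```javascript
// Hypothetical JSON helpers for a BigInt-backed long. These are plain
// functions; nothing here calls into `avro`.
var MAX_SAFE = BigInt(Number.MAX_SAFE_INTEGER);
var MIN_LONG = -(2n ** 63n);
var MAX_LONG = 2n ** 63n - 1n;

var jsonMethods = {
  // Accept numbers (as produced by the Avro spec's JSON encoding) and the
  // decimal strings emitted by `toJSON` below.
  fromJSON: function (any) {
    if (typeof any == 'number' && !Number.isSafeInteger(any)) {
      throw new Error('imprecise or non-integer number: ' + any);
    }
    return BigInt(any);
  },
  // Emit a number when it is exactly representable, a string otherwise
  // (the string form is an extension, not part of the Avro spec).
  toJSON: function (val) {
    var safe = val <= MAX_SAFE && val >= -MAX_SAFE;
    return safe ? Number(val) : val.toString();
  },
  isValid: function (val) {
    return typeof val == 'bigint' && val >= MIN_LONG && val <= MAX_LONG;
  },
  compare: function (val1, val2) {
    return val1 < val2 ? -1 : (val1 > val2 ? 1 : 0);
  }
};

var big = 9007199254740995n; // Number.MAX_SAFE_INTEGER + 4
var json = JSON.stringify(jsonMethods.toJSON(big));
var back = jsonMethods.fromJSON(JSON.parse(json)); // Same value as `big`.
```

Rejecting unsafe numbers in `fromJSON` is a design choice: it surfaces precision loss early instead of silently rounding.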
+
+
+#### Class `ArrayType(attrs, [opts])`
+
++ `attrs` {Object} Decoded type attributes.
++ `opts` {Object} Parsing options.
+
+##### `type.getItemsType()`
+
+The type of the array's items.
+
+
+#### Class `EnumType(attrs, [opts])`
+
++ `attrs` {Object} Decoded type attributes.
++ `opts` {Object} Parsing options.
+
+##### `type.getAliases()`
+
+Optional type aliases. These are used when adapting a schema from another type.
+
+##### `type.getSymbols()`
+
+Returns a copy of the type's symbols (an array of strings representing the
+`enum`'s valid values).
+
+
+#### Class `FixedType(attrs, [opts])`
+
++ `attrs` {Object} Decoded type attributes.
++ `opts` {Object} Parsing options.
+
+##### `type.getAliases()`
+
+Optional type aliases. These are used when adapting a schema from another type.
+
+##### `type.getSize()`
+
+The size in bytes of instances of this type.
+
+
+#### Class `MapType(attrs, [opts])`
+
++ `attrs` {Object} Decoded type attributes.
++ `opts` {Object} Parsing options.
+
+##### `type.getValuesType()`
+
+The type of the map's values (keys are always strings).
+
+
+#### Class `RecordType(attrs, [opts])`
+
++ `attrs` {Object} Decoded type attributes.
++ `opts` {Object} Parsing options.
+
+##### `type.getAliases()`
+
+Optional type aliases. These are used when adapting a schema from another type.
+
+##### `type.getFields()`
+
+Returns a copy of the array of fields contained in this record. Each field is
+an object with the following methods:
+
++ `getAliases()`
++ `getDefault()`
++ `getName()`
++ `getOrder()`
++ `getType()`
+
+##### `type.getRecordConstructor()`
+
+The [`Record`](#class-record) constructor for instances of this type.
+
+
+#### Class `UnionType(attrs, [opts])`
+
++ `attrs` {Object} Decoded type attributes.
++ `opts` {Object} Parsing options.
+
+##### `type.getTypes()`
+
+The possible types that this union can take.
+
+
+#### Class `LogicalType(attrs, [opts,] [Types])`
+
++ `attrs` {Object} Decoded type attributes.
++ `opts` {Object} Parsing options.
++ `Types` {Array} Optional array of type classes. If specified, only these
+ will be accepted as underlying types.
+
+"Abstract class" used to implement custom native types.
+
+##### `type.getUnderlyingType()`
+
+Get the underlying Avro type. This can be useful when a logical type can
+support different underlying types.
+
+To implement a custom logical type, the steps are:
+
++ Extend `LogicalType` in your own subclass (typically using `util.inherits`).
++ Call `LogicalType`'s constructor inside your subclass's constructor to make
+ sure the underlying type is properly set up. Throwing an error anywhere
+ inside your constructor will prevent the logical type from being used (the
+ underlying type will be used instead).
++ Override the methods below (prefixed with an underscore because they are
+ internal to the class that defines them and should only be called by the
+ internal `LogicalType` methods).
+
+See [here][logical-types] for a couple of sample implementations.
+
+##### `type._fromValue(val)`
+
++ `val` {...} A value deserialized by the underlying type.
+
+This function should return the final, wrapped, value.
+
+##### `type._toValue(any)`
+
++ `any` {...} A wrapped value.
+
+This function should return a value which can be serialized by the underlying
+type.
+
+##### `type._resolve(type)`
+
++ `type` {Type} The writer's type.
+
+This function should return:
+
++ `undefined` if the writer's values cannot be converted.
++ Otherwise, a function which converts a value deserialized by the writer's
+ type into a wrapped value for the current type.
+
+
+## Records
+
+Each [`RecordType`](#class-recordtypeattrs-opts) generates a corresponding
+`Record` constructor when its schema is parsed. It is available using the
+`RecordType`'s `getRecordConstructor` method. This helps make decoding and
+encoding records more efficient.
+
+All prototype methods below are prefixed with `$` to avoid clashing with an
+existing record field (`$` is a valid identifier in JavaScript, but not in
+Avro).
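
The reason the `$` prefix is safe can be sketched as follows. Per the Avro specification, names start with a letter or underscore and contain only letters, digits, and underscores, so a `$`-prefixed method can never coincide with a field name. The `Person` constructor and `$describe` method below are hypothetical stand-ins, not the generated `Record` class.

```javascript
// Avro names must start with a letter or underscore and contain only
// letters, digits, and underscores (per the Avro specification).
var AVRO_NAME = /^[A-Za-z_][A-Za-z0-9_]*$/;

function isAvroName(name) {
  return AVRO_NAME.test(name);
}

// A hypothetical record-like object: fields are plain properties, helpers
// live on the prototype under `$`-prefixed names.
function Person(name) { this.name = name; }
Person.prototype.$describe = function () { return 'Person ' + this.name; };

// No prototype helper is a valid Avro field name, so none can be shadowed.
var clashes = Object.keys(Person.prototype).filter(isAvroName);
```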
+
+#### Class `Record(...)`
+
+Calling the constructor directly can sometimes be a convenient shortcut to
+instantiate new records of a given type. In particular, it will correctly
+initialize all of the record's missing fields with their default values.
+
+##### `record.$clone([opts])`
+
++ `opts` {Object} See [`type.clone`](#typecloneval-opts).
+
+Deep copy the record.
+
+##### `record.$compare(val)`
+
++ `val` {Record} See [`type.compare`](#typecompareval1-val2).
+
+Compare the record to another.
+
+##### `record.$getType()`
+
+Get the record's `type`.
+
+##### `record.$isValid([opts])`
+
++ `opts` {Object} See [`type.isValid`](#typeisvalidval-opts).
+
+Check whether the record is valid.
+
+##### `record.$toBuffer()`
+
+Return binary encoding of record.
+
+##### `record.$toString()`
+
+Return JSON-stringified record.
+
+##### `Record.getType()`
+
+Convenience class method to get the record's type.
+
+
+## Files and streams
+
+*Not available in the browser.*
+
+The following convenience functions are available for common operations on
+container files:
+
+#### `createFileDecoder(path, [opts])`
+
++ `path` {String} Path to Avro container file.
++ `opts` {Object} Decoding options, passed to
+ [`BlockDecoder`](Api#class-blockdecoderopts).
+
+Returns a readable stream of decoded objects from an Avro container file.
+
+#### `createFileEncoder(path, schema, [opts])`
+
++ `path` {String} Destination path.
++ `schema` {Object|String|Type} Type used to serialize.
++ `opts` {Object} Encoding options, passed to
+ [`BlockEncoder`](Api#class-blockencoderschema-opts).
+
+Returns a writable stream of objects. These will end up serialized into an Avro
+container file.
+
+#### `extractFileHeader(path, [opts])`
+
++ `path` {String} Path to Avro container file.
++ `opts` {Object} Options:
+ + `decode` {Boolean} Decode schema and codec metadata (otherwise they will be
+ returned as bytes). Defaults to `true`.
+
+Extract header from an Avro container file synchronously. If no header is
+present (i.e. the path doesn't point to a valid Avro container file), `null` is
+returned.
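
As a rough illustration of what "a valid Avro container file" means here: the Avro specification states that object container files begin with the four bytes `'O'`, `'b'`, `'j'`, `0x01`. The helper below is a hypothetical pre-check on a file's first bytes; it makes no claim about how `extractFileHeader` itself is implemented.

```javascript
// Avro object container files start with the ASCII characters 'Obj' followed
// by the byte 0x01 (per the Avro specification).
var MAGIC = Buffer.from([0x4f, 0x62, 0x6a, 0x01]);

// Hypothetical helper: cheap check on the first bytes of a file.
function looksLikeAvroFile(buf) {
  return buf.length >= MAGIC.length && buf.slice(0, MAGIC.length).equals(MAGIC);
}

var ok = looksLikeAvroFile(Buffer.from('Obj\u0001rest of header'));
var ko = looksLikeAvroFile(Buffer.from('{"not": "avro"}'));
```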
+
+
+For more specific use-cases, the following stream classes are available in the
+`avro.streams` namespace:
+
++ [`BlockDecoder`](#class-blockdecoderopts)
++ [`RawDecoder`](#class-rawdecoderschema-opts)
++ [`BlockEncoder`](#class-blockencoderschema-opts)
++ [`RawEncoder`](#class-rawencoderschema-opts)
+
+
+#### Class `BlockDecoder([opts])`
+
++ `opts` {Object} Decoding options. Available keys:
+ + `codecs` {Object} Dictionary of decompression functions, keyed by codec
+ name. A decompression function has the signature `fn(compressedData, cb)`
+ where `compressedData` is a buffer of compressed data, and must call
+ `cb(err, uncompressedData)` on completion. The default contains handlers
+ for the `'null'` and `'deflate'` codecs.
+ + `decode` {Boolean} Whether to decode records before returning them.
+ Defaults to `true`.
+ + `parseOpts` {Object} Options passed when parsing the writer's schema.
+
+A duplex stream which decodes bytes coming from an Avro object container file.
+
+##### Event `'metadata'`
+
++ `type` {Type} The type used to write the file.
++ `codec` {String} The codec's name.
++ `header` {Object} The file's header, containing in particular the raw schema
+ and codec.
+
+##### Event `'data'`
+
++ `data` {...} Decoded element or raw bytes.
+
+##### `BlockDecoder.getDefaultCodecs()`
+
+Get built-in decompression functions (currently `null` and `deflate`).
+
+
+#### Class `RawDecoder(schema, [opts])`
+
++ `schema` {Object|String|Type} Writer schema. Required since the input doesn't
+ contain a header. Argument parsing logic is the same as for
+ [`parse`](#parseschema-opts).
++ `opts` {Object} Decoding options. Available keys:
+ + `decode` {Boolean} Whether to decode records before returning them.
+ Defaults to `true`.
+
+A duplex stream which can be used to decode a stream of serialized Avro objects
+with no headers or blocks.
+
+##### Event `'data'`
+
++ `data` {...} Decoded element or raw bytes.
+
+
+#### Class `BlockEncoder(schema, [opts])`
+
++ `schema` {Object|String|Type} Schema used for encoding. Argument parsing
+ logic is the same as for [`parse`](#parseschema-opts).
++ `opts` {Object} Encoding options. Available keys:
+ + `blockSize` {Number} Maximum uncompressed size of each block data. A new
+ block will be started when this number is exceeded. If it is too small to
+ fit a single element, it will be increased appropriately. Defaults to 64kB.
+ + `codec` {String} Name of codec to use for encoding. See `codecs` option
+ below to support arbitrary compression functions.
+ + `codecs` {Object} Dictionary of compression functions, keyed by codec
+ name. A compression function has the signature `fn(uncompressedData, cb)`
+ where `uncompressedData` is a buffer of uncompressed data, and must call
+ `cb(err, compressedData)` on completion. The default contains handlers for
+ the `'null'` and `'deflate'` codecs.
+ + `omitHeader` {Boolean} Don't emit the header. This can be useful when
+ appending to an existing container file. Defaults to `false`.
+ + `syncMarker` {Buffer} 16 byte buffer to use as synchronization marker
+ inside the file. If unspecified, a random value will be generated.
+
+A duplex stream to create Avro object container files.
+
+##### Event `'data'`
+
++ `data` {Buffer} Serialized bytes.
+
+##### `BlockEncoder.getDefaultCodecs()`
+
+Get built-in compression functions (currently `null` and `deflate`).
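
The `codecs` signatures described for `BlockEncoder` and `BlockDecoder` can be sketched with Node's built-in `zlib`. The `'gzip'` codec name and the `wrapSync` helper are assumptions made for illustration; `'gzip'` is not one of the built-in codecs, so any reader of such a file would presumably need a matching entry in its own `codecs`.

```javascript
var zlib = require('zlib');

// Hypothetical helper turning a synchronous transform into the
// `fn(data, cb)` shape expected by the `codecs` dictionaries.
function wrapSync(fn) {
  return function (data, cb) {
    var out;
    try {
      out = fn(data);
    } catch (err) {
      cb(err);
      return;
    }
    cb(null, out);
  };
}

// Compression functions (encoder side) and decompression functions (decoder
// side), keyed by an illustrative codec name.
var compressors = {
  gzip: wrapSync(function (buf) { return zlib.gzipSync(buf); })
};
var decompressors = {
  gzip: wrapSync(function (buf) { return zlib.gunzipSync(buf); })
};

// Round trip through both functions.
var roundTrip;
compressors.gzip(Buffer.from('avro block'), function (err, compressed) {
  if (err) { throw err; }
  decompressors.gzip(compressed, function (err, data) {
    if (err) { throw err; }
    roundTrip = data.toString();
  });
});
```

Under these assumptions, the encoder would be given `{codec: 'gzip', codecs: compressors}` and the decoder `{codecs: decompressors}`. The asynchronous `zlib.gzip(buf, cb)` and `zlib.gunzip(buf, cb)` already have the required shape and avoid blocking the event loop; the synchronous variants are used here only to keep the sketch easy to follow.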
+
+
+#### Class `RawEncoder(schema, [opts])`
+
++ `schema` {Object|String|Type} Schema used for encoding. Argument parsing
+ logic is the same as for [`parse`](#parseschema-opts).
++ `opts` {Object} Encoding options. Available keys:
+ + `batchSize` {Number} To increase performance, records are serialized in
+ batches. Use this option to control how often batches are emitted. If it is
+ too small to fit a single record, it will be increased automatically.
+ Defaults to 64kB.
+
+The encoding equivalent of `RawDecoder`.
+
+##### Event `'data'`
+
++ `data` {Buffer} Serialized bytes.
+
+
+[canonical-schema]: https://avro.apache.org/docs/current/spec.html#Parsing+Canonical+Form+for+Schemas
+[schema-resolution]: https://avro.apache.org/docs/current/spec.html#Schema+Resolution
+[sort-order]: https://avro.apache.org/docs/current/spec.html#order
+[fingerprint]: https://avro.apache.org/docs/current/spec.html#Schema+Fingerprints
+[custom-long]: Advanced-usage#custom-long-types
+[logical-types]: Advanced-usage#logical-types
Added: avro/trunk/lang/js/doc/Advanced-usage.md
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/doc/Advanced-usage.md?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/doc/Advanced-usage.md (added)
+++ avro/trunk/lang/js/doc/Advanced-usage.md Thu Dec 3 21:35:44 2015
@@ -0,0 +1,359 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+
++ [Schema evolution](#schema-evolution)
++ [Logical types](#logical-types)
++ [Custom long types](#custom-long-types)
+
+
+## Schema evolution
+
+Schema evolution allows a type to deserialize binary data written by another
+[compatible][schema-resolution] type. This is done via
+[`createResolver`][create-resolver-api], and is particularly useful when we are
+only interested in a subset of the fields inside a record. By selectively
+decoding fields, we can significantly increase throughput.
+
+As a motivating example, consider the following event:
+
+```javascript
+var heavyType = avro.parse({
+ name: 'Event',
+ type: 'record',
+ fields: [
+ {name: 'time', type: 'long'},
+ {name: 'userId', type: 'int'},
+ {name: 'actions', type: {type: 'array', items: 'string'}},
+ ]
+});
+```
+
+Let's assume that we would like to compute statistics on users' actions but
+only for a few user IDs. One approach would be to decode the full record each
+time, but this is wasteful if very few users match our filter. We can do better
+by using the following reader's schema, and creating the corresponding
+resolver:
+
+```javascript
+var lightType = avro.parse({
+ name: 'LightEvent',
+ aliases: ['Event'],
+ type: 'record',
+ fields: [
+ {name: 'userId', type: 'int'},
+ ]
+});
+
+var resolver = lightType.createResolver(heavyType);
+```
+
+We decode only the `userId` field, and then, if the ID matches, process the
+full record. The function below implements this logic, returning a fully
+decoded record if the ID matches, and `undefined` otherwise.
+
+```javascript
+function fastDecode(buf) {
+ var lightRecord = lightType.fromBuffer(buf, resolver, true);
+ if (lightRecord.userId % 100 === 48) { // Arbitrary check.
+ return heavyType.fromBuffer(buf);
+ }
+}
+```
+
+In the above example, using randomly generated records, if the filter matches
+roughly 1% of the time, we are able to get a **400%** throughput increase
+compared to decoding the full record each time! The heavier the schema (and the
+closer to the beginning of the record the used fields are), the higher this
+increase will be.
+
+## Logical types
+
+The built-in types provided by Avro are sufficient for many use-cases, but it
+can often be much more convenient to work with native JavaScript objects. As a
+quick motivating example, let's imagine we have the following schema:
+
+```javascript
+var schema = {
+ name: 'Transaction',
+ type: 'record',
+ fields: [
+ {name: 'amount', type: 'int'},
+ {name: 'time', type: {type: 'long', logicalType: 'timestamp-millis'}}
+ ]
+};
+```
+
+The `time` field encodes a timestamp as a `long`, but it would be better if we
+could deserialize it directly into a native `Date` object. This is possible
+using Avro's *logical types*, with the following two steps:
+
++ Adding a `logicalType` attribute to the type's definition (e.g.
+ `'timestamp-millis'` above).
++ Implementing a corresponding [`LogicalType`][logical-type-api] and adding it
+ to [`parse`][parse-api]'s `logicalTypes`.
+
+Below is a sample implementation for a suitable `DateType` which will
+transparently deserialize/serialize native `Date` objects:
+
+```javascript
+var util = require('util');
+
+// Type classes are exposed in the `avro.types` namespace.
+var LogicalType = avro.types.LogicalType;
+var LongType = avro.types.LongType;
+
+function DateType(attrs, opts) {
+  // Require the underlying type to be a `long`.
+  LogicalType.call(this, attrs, opts, [LongType]);
+}
+util.inherits(DateType, LogicalType);
+
+DateType.prototype._fromValue = function (val) { return new Date(val); };
+DateType.prototype._toValue = function (date) { return +date; };
+```
+
+Usage is straightforward:
+
+```javascript
+// The `logicalTypes` key must match the schema's `logicalType` attribute.
+var type = avro.parse(schema, {logicalTypes: {'timestamp-millis': DateType}});
+
+// We create a new transaction.
+var transaction = {
+ amount: 32,
+ time: new Date('Thu Nov 05 2015 11:38:05 GMT-0800 (PST)')
+};
+
+// Our type is able to directly serialize it, including the date.
+var buf = type.toBuffer(transaction);
+
+// And we can get the date back just as easily.
+var date = type.fromBuffer(buf).time; // `Date` object.
+```
+
+Logical types can also be used with schema evolution. This is done by
+implementing an additional `_resolve` method. It should return a function which
+converts values of the writer's type into the logical type's values. For
+example, we can allow our `DateType` to read dates which were serialized as
+strings:
+
+```javascript
+var StringType = avro.types.StringType;
+
+DateType.prototype._resolve = function (type) {
+ if (
+ type instanceof StringType || // Support parsing strings.
+ type instanceof LongType ||
+ type instanceof DateType
+ ) {
+ return this._fromValue;
+ }
+};
+```
+
+And use it as follows:
+
+```javascript
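+var dateType = avro.parse(
+  {type: 'long', logicalType: 'timestamp-millis'},
+  {logicalTypes: {'timestamp-millis': DateType}}
+);
+var stringType = avro.parse('string');
+var str = 'Thu Nov 05 2015 11:38:05 GMT-0800 (PST)';
+var buf = stringType.toBuffer(str);
+var resolver = dateType.createResolver(stringType);
+var date = dateType.fromBuffer(buf, resolver); // Date corresponding to `str`.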
+```
+
+Finally, as a more fully featured example, we provide a sample implementation
+of the [decimal logical type][decimal-type] described in the spec:
+
+```javascript
+/**
+ * Sample decimal logical type implementation.
+ *
+ * It wraps its values in a very simple custom `Decimal` class.
+ *
+ */
+function DecimalType(attrs, opts) {
+ LogicalType.call(this, attrs, opts, [BytesType, FixedType]);
+
+ // Validate attributes.
+ var precision = attrs.precision;
+ if (precision !== (precision | 0) || precision <= 0) {
+ throw new Error('invalid precision');
+ }
+ var scale = attrs.scale;
+ if (scale !== (scale | 0) || scale < 0 || scale > precision) {
+ throw new Error('invalid scale');
+ }
+ var type = this.getUnderlyingType();
+ if (type instanceof FixedType) {
+ var size = type.getSize();
+ var maxPrecision = Math.log(Math.pow(2, 8 * size - 1) - 1) / Math.log(10);
+ if (precision > (maxPrecision | 0)) {
+ throw new Error('fixed size too small to hold required precision');
+ }
+ }
+
+ // A basic decimal class for this precision and scale.
+ function Decimal(unscaled) { this.unscaled = unscaled; }
+ Decimal.prototype.precision = precision;
+ Decimal.prototype.scale = scale;
+ Decimal.prototype.toNumber = function () {
+ return this.unscaled * Math.pow(10, -scale);
+ };
+
+ this.Decimal = Decimal;
+}
+util.inherits(DecimalType, LogicalType);
+
+DecimalType.prototype._fromValue = function (buf) {
+ return new this.Decimal(buf.readIntBE(0, buf.length));
+};
+
+DecimalType.prototype._toValue = function (dec) {
+ if (!(dec instanceof this.Decimal)) {
+ throw new Error('invalid decimal');
+ }
+
+ var type = this.getUnderlyingType();
+ var buf;
+ if (type instanceof FixedType) {
+ buf = new Buffer(type.getSize());
+ } else {
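+    // Number of bytes sufficient to hold the unscaled value as a signed
+    // (two's complement) integer.
+    var n = dec.unscaled;
+    var size = Math.log(n >= 0 ? 2 * n + 1 : -2 * n) / (Math.log(2) * 8) | 0;
+    buf = new Buffer(size + 1);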
+ }
+ buf.writeIntBE(dec.unscaled, 0, buf.length);
+ return buf;
+};
+
+DecimalType.prototype._resolve = function (type) {
+ if (
+ type instanceof DecimalType &&
+ type.Decimal.prototype.precision === this.Decimal.prototype.precision &&
+ type.Decimal.prototype.scale === this.Decimal.prototype.scale
+ ) {
+ return function (dec) { return dec; };
+ }
+};
+```
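
The precision check in the constructor above follows from a short calculation: a signed (two's complement) integer stored in `size` bytes tops out at `2^(8 * size - 1) - 1`, so only `floor(log10(...))` decimal digits are guaranteed to fit. The `maxPrecision` helper name below is an assumption made for illustration.

```javascript
// Largest number of decimal digits that always fits in a signed
// (two's complement) integer stored in `size` bytes.
function maxPrecision(size) {
  return Math.floor(Math.log10(Math.pow(2, 8 * size - 1) - 1));
}

var precisions = [1, 2, 4, 8].map(maxPrecision); // [2, 4, 9, 18]
```

For instance, a 4-byte `fixed` can hold any 9-digit unscaled value but not every 10-digit one, which is why a `precision` of 10 would be rejected for it.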
+
+
+## Custom long types
+
+JavaScript represents all numbers as doubles internally, which means that it is
+possible to lose precision when using very large numbers (absolute value
+greater than `Number.MAX_SAFE_INTEGER`, i.e. `2^53 - 1` or roughly `9e+15`). For
+example:
+
+```javascript
+Number.parseInt('9007199254740995') === 9007199254740996 // true
+```
+
+In most cases, these bounds are so large that this is not a problem (timestamps
+fit nicely inside the supported precision). However it might happen that the
+full range must be supported. (To avoid silently corrupting data, the default
+[`LongType`](Api#class-longtypeattrs-opts) will throw an error when
+encountering a number outside the supported precision range.)
+
+There are multiple JavaScript libraries to represent 64-bit integers, with
+different characteristics (e.g. some are faster but do not run in the browser).
+Rather than tie us to any particular one, `avro` lets us choose the most
+adequate with [`LongType.using`](Api#longtypeusingmethods-nounpack). Below
+are a few sample implementations for popular libraries (refer to the API
+documentation for details on each option; a helper script is also available to
+validate our implementation inside `etc/scripts/`):
+
++ [`node-int64`](https://www.npmjs.com/package/node-int64):
+
+ ```javascript
+ var Long = require('node-int64');
+
+ var longType = avro.types.LongType.using({
+ fromBuffer: function (buf) { return new Long(buf); },
+ toBuffer: function (n) { return n.toBuffer(); },
+ fromJSON: function (obj) { return new Long(obj); },
+ toJSON: function (n) { return +n; },
+ isValid: function (n) { return n instanceof Long; },
+ compare: function (n1, n2) { return n1.compare(n2); }
+ });
+ ```
+
++ [`int64-native`](https://www.npmjs.com/package/int64-native):
+
+ ```javascript
+ var Long = require('int64-native');
+
+ var longType = avro.types.LongType.using({
+ fromBuffer: function (buf) {
+   return new Long('0x' + buf.toString('hex'));
+ },
+ toBuffer: function (n) {
+   return new Buffer(n.toString().slice(2), 'hex');
+ },
+ fromJSON: function (obj) { return new Long(obj); },
+ toJSON: function (n) { return +n; },
+ isValid: function (n) { return n instanceof Long; },
+ compare: function (n1, n2) { return n1.compare(n2); }
+ });
+ ```
+
++ [`long`](https://www.npmjs.com/package/long):
+
+ ```javascript
+ var Long = require('long');
+
+ var longType = avro.types.LongType.using({
+ fromBuffer: function (buf) {
+ return new Long(buf.readInt32LE(), buf.readInt32LE(4));
+ },
+ toBuffer: function (n) {
+ var buf = new Buffer(8);
+ buf.writeInt32LE(n.getLowBits());
+ buf.writeInt32LE(n.getHighBits(), 4);
+ return buf;
+ },
+ fromJSON: Long.fromValue,
+ toJSON: function (n) { return +n; },
+ isValid: Long.isLong,
+ compare: Long.compare
+ });
+ ```
+
+Any such implementation can then be used in place of the default `LongType` to
+provide full 64-bit support when decoding and encoding binary data. To do so,
+we override the default type used for `long`s by adding our implementation to
+the `registry` when parsing a schema:
+
+```javascript
+// Our schema here is very simple, but this would work for arbitrarily complex
+// ones (applying to all longs inside of it).
+var type = avro.parse('long', {registry: {'long': longType}});
+
+// Avro serialization of Number.MAX_SAFE_INTEGER + 4 (which is incorrectly
+// rounded when represented as a double):
+var buf = new Buffer([0x86, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x20]);
+
+// Assuming we are using the `node-int64` implementation.
+var obj = new Long(buf);
+var encoded = type.toBuffer(obj); // == buf
+var decoded = type.fromBuffer(buf); // == obj (No precision loss.)
+```
+
+Because the built-in JSON parser is itself limited by JavaScript's internal
+number representation, using the `toString` and `fromString` methods is
+generally still unsafe (see `LongType.using`'s documentation for a possible
+workaround).
+
+Finally, to make integration easier, `toBuffer` and `fromBuffer` deal with
+already unpacked buffers by default. To leverage an external optimized packing
+and unpacking routine (for example when using a native C++ addon), we can
+disable this behavior by setting `LongType.using`'s `noUnpack` argument to
+`true`.
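
To make the packed representation concrete: the Avro specification encodes a `long` with zig-zag coding followed by a variable-length (7 bits per byte) layout. The sketch below implements that packing with native `BigInt`, assuming a runtime that supports it (which postdates this commit); the `pack` and `unpack` names are hypothetical and are not part of `avro`.

```javascript
// Pack a 64-bit integer (given as a BigInt) into Avro's variable-length
// zig-zag encoding.
function pack(n) {
  var z = BigInt.asUintN(64, (n << 1n) ^ (n >> 63n)); // Zig-zag.
  var bytes = [];
  while (z > 0x7fn) {
    bytes.push(Number(z & 0x7fn) | 0x80); // Lower 7 bits, continuation set.
    z >>= 7n;
  }
  bytes.push(Number(z));
  return Buffer.from(bytes);
}

// Reverse operation: rebuild the zig-zag value, then undo the zig-zag.
function unpack(buf) {
  var z = 0n;
  for (var i = 0; i < buf.length; i++) {
    z |= BigInt(buf[i] & 0x7f) << BigInt(7 * i);
  }
  return (z >> 1n) ^ -(z & 1n);
}

// Number.MAX_SAFE_INTEGER + 4, the value used in the example above.
var packed = pack(9007199254740995n);
var unpacked = unpack(packed);
```

With `noUnpack` set to `true`, functions of this shape could serve as the `toBuffer` and `fromBuffer` entries passed to `LongType.using`, alongside the remaining required methods.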
+
+[parse-api]: API#parseschema-opts
+[create-resolver-api]: API#typecreateresolverwritertype
+[logical-type-api]: API#class-logicaltypeattrs-opts-types
+[decimal-type]: https://avro.apache.org/docs/current/spec.html#Decimal
+[schema-resolution]: https://avro.apache.org/docs/current/spec.html#Schema+Resolution
Added: avro/trunk/lang/js/doc/Home.md
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/doc/Home.md?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/doc/Home.md (added)
+++ avro/trunk/lang/js/doc/Home.md Thu Dec 3 21:35:44 2015
@@ -0,0 +1,191 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+
+This page is meant to provide a brief overview of `avro`'s API:
+
++ [What is a `Type`?](#what-is-a-type)
++ [How do I get a `Type`?](#how-do-i-get-a-type)
++ [What about Avro files?](#what-about-avro-files)
++ [Next steps](#next-steps)
+
+
+## What is a `Type`?
+
+Each Avro type maps to a corresponding JavaScript [`Type`](API#class-type):
+
++ `int` maps to `IntType`.
++ `array`s map to `ArrayType`s.
++ `record`s map to `RecordType`s.
++ etc.
+
+An instance of a `Type` knows how to [`decode`](Api#typedecodebuf-pos-resolver)
+and [`encode`](Api#typeencodeval-buf-pos) its corresponding objects. For
+example the `StringType` knows how to handle JavaScript strings:
+
+```javascript
+var stringType = new avro.types.StringType();
+var buf = stringType.toBuffer('Hi'); // Buffer containing 'Hi''s Avro encoding.
+var str = stringType.fromBuffer(buf); // === 'Hi'
+```
+
+The [`toBuffer`](API#typetobufferval) and
+[`fromBuffer`](API#typefrombufferval-resolver-nocheck) methods above are
+convenience functions which encode and decode a single object into/from a
+standalone buffer.
+
+Each `type` also provides other methods which can be useful. Here are a few
+(refer to the [API documentation](API#avro-types) for the full list):
+
++ JSON-encoding:
+
+ ```javascript
+ var jsonString = stringType.toString('Hi'); // === '"Hi"'
+ var str = stringType.fromString(jsonString); // === 'Hi'
+ ```
+
++ Validity checks:
+
+ ```javascript
+ var b1 = stringType.isValid('hello'); // === true ('hello' is a valid string.)
+ var b2 = stringType.isValid(-2); // === false (-2 is not.)
+ ```
+
++ Random object generation:
+
+ ```javascript
+ var s = stringType.random(); // A random string.
+ ```
+
+
+## How do I get a `Type`?
+
+It is possible to instantiate types directly by calling their constructors
+(available in the `avro.types` namespace; this is what we used earlier), but in
+the vast majority of use-cases they will be automatically generated by parsing
+an existing schema.
+
+`avro` exposes a [`parse`](Api#parseschema-opts) method to do the
+heavy lifting:
+
+```javascript
+// Equivalent to what we did earlier.
+var stringType = avro.parse({type: 'string'});
+
+// A slightly more complex type.
+var mapType = avro.parse({type: 'map', values: 'long'});
+
+// The sky is the limit!
+var personType = avro.parse({
+ name: 'Person',
+ type: 'record',
+ fields: [
+ {name: 'name', type: 'string'},
+ {name: 'phone', type: ['null', 'string'], default: null},
+ {name: 'address', type: {
+ name: 'Address',
+ type: 'record',
+ fields: [
+ {name: 'city', type: 'string'},
+ {name: 'zip', type: 'int'}
+ ]
+ }}
+ ]
+});
+```
+
+Of course, all the `type` methods are available. For example:
+
+```javascript
+personType.isValid({
+ name: 'Ann',
+ phone: null,
+ address: {city: 'Cambridge', zip: 2139} // No leading zero in number literals.
+}); // === true
+
+personType.isValid({
+ name: 'Bob',
+ phone: {string: '617-000-1234'},
+ address: {city: 'Boston'}
+}); // === false (Missing the zip code.)
+```
+
+Since schemas are often stored in separate files, passing a path to `parse`
+will attempt to load a JSON-serialized schema from there:
+
+```javascript
+var couponType = avro.parse('./Coupon.avsc');
+```
+
+For advanced use-cases, `parse` also has a few options which are detailed in
+the API documentation.
+
+
+## What about Avro files?
+
+Avro files (meaning [Avro object container files][object-container]) hold
+serialized Avro records along with their schema. Reading them is as simple as
+calling [`createFileDecoder`](Api#createfiledecoderpath-opts):
+
+```javascript
+var personStream = avro.createFileDecoder('./persons.avro');
+```
+
+`personStream` is a [readable stream][rstream] of decoded records, which we can
+for example use as follows:
+
+```javascript
+personStream.on('data', function (person) {
+ if (person.address.city === 'San Francisco') {
+ doSomethingWith(person);
+ }
+});
+```
+
+In case we need the records' `type` or the file's codec, they are available by
+listening to the `'metadata'` event:
+
+```javascript
+personStream.on('metadata', function (type, codec) {
+  /* Something useful. */
+});
+```
+
+To access a file's header synchronously, there also exists an
+[`extractFileHeader`](Api#extractfileheaderpath-opts) method:
+
+```javascript
+var header = avro.extractFileHeader('persons.avro');
+```
+
+Writing to an Avro container file is possible using
+[`createFileEncoder`](Api#createfileencoderpath-schema-opts):
+
+```javascript
+var encoder = avro.createFileEncoder('./processed.avro', type);
+```
+
+
+## Next steps
+
+The [API documentation](Api) provides a comprehensive list of available
+functions and their options. The [Advanced usage section](Advanced-usage) goes
+through a few examples to show how the API can be used.
+
+
+
+[object-container]: https://avro.apache.org/docs/current/spec.html#Object+Container+Files
+[rstream]: https://nodejs.org/api/stream.html#stream_class_stream_readable
Added: avro/trunk/lang/js/etc/browser/avro.js
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/etc/browser/avro.js?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/etc/browser/avro.js (added)
+++ avro/trunk/lang/js/etc/browser/avro.js Thu Dec 3 21:35:44 2015
@@ -0,0 +1,91 @@
+/* jshint browserify: true */
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+'use strict';
+
+/**
+ * Shim entry point used when `avro` is `require`d from browserify.
+ *
+ * It doesn't expose any of the filesystem methods and patches a few others.
+ *
+ */
+
+var Tap = require('../../lib/utils').Tap,
+ schemas = require('../../lib/schemas'),
+ deprecated = require('../deprecated/validator');
+
+
+function parse(schema, opts) {
+ var obj;
+ if (typeof schema == 'string') {
+ try {
+ obj = JSON.parse(schema);
+ } catch (err) {
+ // Pass. No file reading from the browser.
+ }
+ }
+ if (obj === undefined) {
+ obj = schema;
+ }
+ return schemas.createType(obj, opts);
+}
+
+// No utf8 and binary functions on browserify's `Buffer`, we must patch in the
+// generic slice and write equivalents.
+
+Tap.prototype.readString = function () {
+ var len = this.readLong();
+ var pos = this.pos;
+ var buf = this.buf;
+ this.pos += len;
+ if (this.pos > buf.length) {
+ return;
+ }
+ return this.buf.slice(pos, pos + len).toString();
+};
+
+Tap.prototype.writeString = function (s) {
+ var len = Buffer.byteLength(s);
+ this.writeLong(len);
+ var pos = this.pos;
+ this.pos += len;
+ if (this.pos > this.buf.length) {
+ return;
+ }
+ this.buf.write(s, pos);
+};
+
+Tap.prototype.writeBinary = function (s, len) {
+ var pos = this.pos;
+ this.pos += len;
+ if (this.pos > this.buf.length) {
+ return;
+ }
+ this.buf.write(s, pos, len, 'binary');
+};
+
+
+module.exports = {
+ parse: parse,
+ types: schemas.types,
+ Validator: deprecated.Validator,
+ ProtocolValidator: deprecated.ProtocolValidator
+};
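The patched `readString` and `writeString` above rely on the same layout: a length written as an Avro `long`, followed by that many UTF-8 bytes, with overflow signalled by letting `pos` run past the end of the buffer. A minimal sketch of that round trip, assuming a hypothetical `MiniTap` stand-in rather than the library's `Tap` (its zigzag varint `writeLong` and `readLong` only handle small values):

```javascript
'use strict';

// Hypothetical, simplified stand-in for the library's `Tap`.
function MiniTap(buf) {
  this.buf = buf;
  this.pos = 0;
}

MiniTap.prototype.isValid = function () {
  return this.pos <= this.buf.length;
};

// Zigzag + varint encoding, valid for small integers only in this sketch.
MiniTap.prototype.writeLong = function (n) {
  var z = ((n << 1) ^ (n >> 31)) >>> 0;
  while (z > 0x7f) {
    if (this.pos < this.buf.length) {
      this.buf[this.pos] = (z & 0x7f) | 0x80;
    }
    this.pos++;
    z >>>= 7;
  }
  if (this.pos < this.buf.length) {
    this.buf[this.pos] = z;
  }
  this.pos++;
};

MiniTap.prototype.readLong = function () {
  var z = 0, shift = 0, b;
  do {
    b = this.buf[this.pos++];
    z |= (b & 0x7f) << shift;
    shift += 7;
  } while (b & 0x80);
  return (z >>> 1) ^ -(z & 1);
};

// Same shape as the shim: advance `pos` first, then write only if it fits.
MiniTap.prototype.writeString = function (s) {
  var len = Buffer.byteLength(s);
  this.writeLong(len);
  var pos = this.pos;
  this.pos += len;
  if (this.pos > this.buf.length) {
    return; // Overflow, the caller is expected to check `isValid()`.
  }
  this.buf.write(s, pos);
};

MiniTap.prototype.readString = function () {
  var len = this.readLong();
  var pos = this.pos;
  this.pos += len;
  if (this.pos > this.buf.length) {
    return;
  }
  return this.buf.slice(pos, pos + len).toString();
};

var tap = new MiniTap(Buffer.alloc(16));
tap.writeString('héllo'); // 1 length byte + 6 UTF-8 bytes.
tap.pos = 0;
var decoded = tap.readString();
```

Because the length is a byte count and not a character count, `Buffer.byteLength` is used instead of `s.length`.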
Added: avro/trunk/lang/js/etc/browser/crypto.js
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/etc/browser/crypto.js?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/etc/browser/crypto.js (added)
+++ avro/trunk/lang/js/etc/browser/crypto.js Thu Dec 3 21:35:44 2015
@@ -0,0 +1,36 @@
+/* jshint browserify: true */
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+'use strict';
+
+/**
+ * Shim to disable schema fingerprint computation.
+ *
+ */
+
+function createHash() {
+ throw new Error('fingerprinting not supported in the browser');
+}
+
+
+module.exports = {
+ createHash: createHash
+};
Copied: avro/trunk/lang/js/etc/deprecated/Gruntfile.js (from r1717830,
avro/trunk/lang/js/Gruntfile.js)
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/etc/deprecated/Gruntfile.js?p2=avro/trunk/lang/js/etc/deprecated/Gruntfile.js&p1=avro/trunk/lang/js/Gruntfile.js&r1=1717830&r2=1717850&rev=1717850&view=diff
==============================================================================
(empty)
Copied: avro/trunk/lang/js/etc/deprecated/README (from r1717830,
avro/trunk/lang/js/README)
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/etc/deprecated/README?p2=avro/trunk/lang/js/etc/deprecated/README&p1=avro/trunk/lang/js/README&r1=1717830&r2=1717850&rev=1717850&view=diff
==============================================================================
(empty)
Copied: avro/trunk/lang/js/etc/deprecated/test_validator.js (from r1717830,
avro/trunk/lang/js/test/validator.js)
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/etc/deprecated/test_validator.js?p2=avro/trunk/lang/js/etc/deprecated/test_validator.js&p1=avro/trunk/lang/js/test/validator.js&r1=1717830&r2=1717850&rev=1717850&view=diff
==============================================================================
--- avro/trunk/lang/js/test/validator.js (original)
+++ avro/trunk/lang/js/etc/deprecated/test_validator.js Thu Dec 3 21:35:44 2015
@@ -13,7 +13,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
-var validator = require('../lib/validator.js');
+var validator = require('./validator');
var Validator = validator.Validator;
var ProtocolValidator = validator.ProtocolValidator;
Copied: avro/trunk/lang/js/etc/deprecated/validator.js (from r1717830,
avro/trunk/lang/js/lib/validator.js)
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/etc/deprecated/validator.js?p2=avro/trunk/lang/js/etc/deprecated/validator.js&p1=avro/trunk/lang/js/lib/validator.js&r1=1717830&r2=1717850&rev=1717850&view=diff
==============================================================================
--- avro/trunk/lang/js/lib/validator.js (original)
+++ avro/trunk/lang/js/etc/deprecated/validator.js Thu Dec 3 21:35:44 2015
@@ -13,9 +13,12 @@
// See the License for the specific language governing permissions and
// limitations under the License.
-if (typeof require !== 'undefined') {
- var _ = require("underscore");
-}
+var _ = require("underscore"),
+ util = require('util');
+
+var WARNING = 'Validator API is deprecated. Please use the type API instead.';
+Validator = util.deprecate(Validator, WARNING);
+ProtocolValidator = util.deprecate(ProtocolValidator, WARNING);
var AvroSpec = {
PrimitiveTypes: ['null', 'boolean', 'int', 'long', 'float', 'double',
'bytes', 'string'],
@@ -409,7 +412,7 @@ function Validator(schema, namespace, na
Validator.validate = function(schema, obj) {
return (new Validator(schema)).validate(obj);
-};
+}
function ProtocolValidator(protocol) {
this.validate = function(typeName, obj) {
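The deprecated validator is kept working by wrapping its constructors with Node's `util.deprecate`, which returns a function that behaves like the original but emits a deprecation warning the first time it is called. A small sketch with a hypothetical `OldValidator` constructor (not the real `Validator`):

```javascript
'use strict';

var util = require('util');

// Hypothetical stand-in for a legacy constructor. Like the deprecated
// validator, it attaches its methods inside the constructor.
function OldValidator(typeName) {
  this.validate = function (obj) {
    return typeof obj === typeName;
  };
}

var WARNING = 'OldValidator is deprecated. Please use the new API instead.';
var Wrapped = util.deprecate(OldValidator, WARNING);

// The wrapper can still be used as a constructor. The warning is printed to
// stderr once, on first use.
var validator = new Wrapped('string');
var ok = validator.validate('abc');
```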
Added: avro/trunk/lang/js/lib/files.js
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/lib/files.js?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/lib/files.js (added)
+++ avro/trunk/lang/js/lib/files.js Thu Dec 3 21:35:44 2015
@@ -0,0 +1,666 @@
+/* jshint node: true */
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+'use strict';
+
+var schemas = require('./schemas'),
+ utils = require('./utils'),
+ fs = require('fs'),
+ stream = require('stream'),
+ util = require('util'),
+ zlib = require('zlib');
+
+// Type of Avro header.
+var HEADER_TYPE = schemas.createType({
+ type: 'record',
+ name: 'org.apache.avro.file.Header',
+ fields : [
+ {name: 'magic', type: {type: 'fixed', name: 'Magic', size: 4}},
+ {name: 'meta', type: {type: 'map', values: 'bytes'}},
+ {name: 'sync', type: {type: 'fixed', name: 'Sync', size: 16}}
+ ]
+});
+
+// Type of each block.
+var BLOCK_TYPE = schemas.createType({
+ type: 'record',
+ name: 'org.apache.avro.file.Block',
+ fields : [
+ {name: 'count', type: 'long'},
+ {name: 'data', type: 'bytes'},
+ {name: 'sync', type: {type: 'fixed', name: 'Sync', size: 16}}
+ ]
+});
+
+// Used to encode each block's count and size without copying all its data.
+var LONG_TYPE = schemas.createType('long');
+
+// First 4 bytes of an Avro object container file.
+var MAGIC_BYTES = new Buffer('Obj\x01');
+
+// Convenience.
+var f = util.format;
+var Tap = utils.Tap;
+
+
+/**
+ * Parse a schema and return the corresponding type.
+ *
+ */
+function parse(schema, opts) {
+ return schemas.createType(loadSchema(schema), opts);
+}
+
+
+/**
+ * Duplex stream for decoding fragments.
+ *
+ */
+function RawDecoder(schema, opts) {
+ opts = opts || {};
+
+ var decode = opts.decode === undefined ? true : !!opts.decode;
+ stream.Duplex.call(this, {
+ readableObjectMode: decode,
+ allowHalfOpen: false
+ });
+ // Somehow setting `allowHalfOpen` to false only closes the writable side
+ // after the readable side ends, while we need the other way. So we do it
+ // manually.
+
+ this._type = parse(schema);
+ this._tap = new Tap(new Buffer(0));
+ this._needPush = false;
+ this._readValue = createReader(decode, this._type);
+ this._finished = false;
+
+ this.on('finish', function () {
+ this._finished = true;
+ this._read();
+ });
+}
+util.inherits(RawDecoder, stream.Duplex);
+
+RawDecoder.prototype._write = function (chunk, encoding, cb) {
+ var tap = this._tap;
+ tap.buf = Buffer.concat([tap.buf.slice(tap.pos), chunk]);
+ tap.pos = 0;
+ if (this._needPush) {
+ this._needPush = false;
+ this._read();
+ }
+ cb();
+};
+
+RawDecoder.prototype._read = function () {
+ var tap = this._tap;
+ var pos = tap.pos;
+ var val = this._readValue(tap);
+ if (tap.isValid()) {
+ this.push(val);
+ } else if (!this._finished) {
+ tap.pos = pos;
+ this._needPush = true;
+ } else {
+ this.push(null);
+ }
+};
+
+
+/**
+ * Duplex stream for decoding object container files.
+ *
+ */
+function BlockDecoder(opts) {
+ opts = opts || {};
+
+ var decode = opts.decode === undefined ? true : !!opts.decode;
+ stream.Duplex.call(this, {
+ allowHalfOpen: true, // For async decompressors.
+ readableObjectMode: decode
+ });
+
+ this._type = null;
+ this._codecs = opts.codecs;
+ this._parseOpts = opts.parseOpts || {};
+ this._tap = new Tap(new Buffer(0));
+ this._blockTap = new Tap(new Buffer(0));
+ this._syncMarker = null;
+ this._readValue = null;
+ this._decode = decode;
+ this._queue = new utils.OrderedQueue();
+ this._decompress = null; // Decompression function.
+ this._index = 0; // Next block index.
+ this._pending = 0; // Number of blocks undergoing decompression.
+ this._needPush = false;
+ this._finished = false;
+
+ this.on('finish', function () {
+ this._finished = true;
+ if (!this._pending) {
+ this.push(null);
+ }
+ });
+}
+util.inherits(BlockDecoder, stream.Duplex);
+
+BlockDecoder.getDefaultCodecs = function () {
+ return {
+ 'null': function (buf, cb) { cb(null, buf); },
+ 'deflate': zlib.inflateRaw
+ };
+};
+
+BlockDecoder.prototype._decodeHeader = function () {
+ var tap = this._tap;
+ var header = HEADER_TYPE._read(tap);
+ if (!tap.isValid()) {
+ // Wait until more data arrives.
+ return false;
+ }
+
+ if (!MAGIC_BYTES.equals(header.magic)) {
+ this.emit('error', new Error('invalid magic bytes'));
+ return;
+ }
+
+ var codec = (header.meta['avro.codec'] || 'null').toString();
+ this._decompress = (this._codecs || BlockDecoder.getDefaultCodecs())[codec];
+ if (!this._decompress) {
+ this.emit('error', new Error(f('unknown codec: %s', codec)));
+ return;
+ }
+
+ try {
+ var schema = JSON.parse(header.meta['avro.schema'].toString());
+ this._type = parse(schema, this._parseOpts);
+ } catch (err) {
+ this.emit('error', err);
+ return;
+ }
+
+ this._readValue = createReader(this._decode, this._type);
+ this._syncMarker = header.sync;
+ this.emit('metadata', this._type, codec, header);
+ return true;
+};
+
+BlockDecoder.prototype._write = function (chunk, encoding, cb) {
+ var tap = this._tap;
+ tap.buf = Buffer.concat([tap.buf, chunk]);
+ tap.pos = 0;
+
+ if (!this._decodeHeader()) {
+ process.nextTick(cb);
+ return;
+ }
+
+ // We got the header, switch to block decoding mode. Also, call it directly
+ // in case we already have all the data (in which case `_write` wouldn't get
+ // called anymore).
+ this._write = this._writeChunk;
+ this._write(new Buffer(0), encoding, cb);
+};
+
+BlockDecoder.prototype._writeChunk = function (chunk, encoding, cb) {
+ var tap = this._tap;
+ tap.buf = Buffer.concat([tap.buf.slice(tap.pos), chunk]);
+ tap.pos = 0;
+
+ var block;
+ while ((block = tryReadBlock(tap))) {
+ if (!this._syncMarker.equals(block.sync)) {
+ cb(new Error('invalid sync marker'));
+ return;
+ }
+ this._decompress(block.data, this._createBlockCallback());
+ }
+
+ cb();
+};
+
+BlockDecoder.prototype._createBlockCallback = function () {
+ var self = this;
+ var index = this._index++;
+ this._pending++;
+
+ return function (err, data) {
+ if (err) {
+ self.emit('error', err);
+ return;
+ }
+ self._pending--;
+ self._queue.push(new BlockData(index, data));
+ if (self._needPush) {
+ self._needPush = false;
+ self._read();
+ }
+ };
+};
+
+BlockDecoder.prototype._read = function () {
+ var tap = this._blockTap;
+ if (tap.pos >= tap.buf.length) {
+ var data = this._queue.pop();
+ if (!data) {
+ if (this._finished && !this._pending) {
+ this.push(null);
+ } else {
+ this._needPush = true;
+ }
+ return; // Wait for more data.
+ }
+ tap.buf = data.buf;
+ tap.pos = 0;
+ }
+
+ this.push(this._readValue(tap)); // The read is guaranteed valid.
+};
+
+
+/**
+ * Transform stream for encoding fragments.
+ *
+ */
+function RawEncoder(schema, opts) {
+ opts = opts || {};
+
+ stream.Transform.call(this, {
+ writableObjectMode: true,
+ allowHalfOpen: false
+ });
+
+ this._type = parse(schema);
+ this._writeValue = function (tap, val) {
+ try {
+ this._type._write(tap, val);
+ } catch (err) {
+ this.emit('error', err);
+ }
+ };
+ this._tap = new Tap(new Buffer(opts.batchSize || 65536));
+}
+util.inherits(RawEncoder, stream.Transform);
+
+RawEncoder.prototype._transform = function (val, encoding, cb) {
+ var tap = this._tap;
+ var buf = tap.buf;
+ var pos = tap.pos;
+
+ this._writeValue(tap, val);
+ if (!tap.isValid()) {
+ if (pos) {
+ // Emit any valid data.
+ this.push(copyBuffer(tap.buf, 0, pos));
+ }
+ var len = tap.pos - pos;
+ if (len > buf.length) {
+ // Not enough space for last written object, need to resize.
+ tap.buf = new Buffer(2 * len);
+ }
+ tap.pos = 0;
+ this._writeValue(tap, val); // Rewrite last failed write.
+ }
+
+ cb();
+};
+
+RawEncoder.prototype._flush = function (cb) {
+ var tap = this._tap;
+ var pos = tap.pos;
+ if (pos) {
+ // This should only ever be false if nothing is written to the stream.
+ this.push(tap.buf.slice(0, pos));
+ }
+ cb();
+};
+
+
+/**
+ * Duplex stream to write object container files.
+ *
+ * @param schema
+ * @param opts {Object}
+ *
+ * + `blockSize`, uncompressed.
+ * + `codec`
+ * + `codecs`
+ * + `noCheck`
+ * + `omitHeader`, useful to append to an existing block file.
+ *
+ */
+function BlockEncoder(schema, opts) {
+ opts = opts || {};
+
+ stream.Duplex.call(this, {
+ allowHalfOpen: true, // To support async compressors.
+ writableObjectMode: true
+ });
+
+ var obj, type;
+ if (schema instanceof schemas.types.Type) {
+ type = schema;
+ schema = undefined;
+ } else {
+ // Keep full schema to be able to write it to the header later.
+ obj = loadSchema(schema);
+ type = schemas.createType(obj);
+ schema = JSON.stringify(obj);
+ }
+
+ this._schema = schema;
+ this._type = type;
+ this._writeValue = function (tap, val) {
+ try {
+ this._type._write(tap, val);
+ } catch (err) {
+ this.emit('error', err);
+ }
+ };
+ this._blockSize = opts.blockSize || 65536;
+ this._tap = new Tap(new Buffer(this._blockSize));
+ this._codecs = opts.codecs;
+ this._codec = opts.codec || 'null';
+ this._compress = null;
+ this._omitHeader = opts.omitHeader || false;
+ this._blockCount = 0;
+ this._index = 0; // Next block index.
+ this._syncMarker = opts.syncMarker || new utils.Lcg().nextBuffer(16);
+ this._queue = new utils.OrderedQueue();
+ this._pending = 0;
+ this._finished = false;
+ this._needPush = false;
+
+ this.on('finish', function () {
+ this._finished = true;
+ if (this._blockCount) {
+ this._flushChunk();
+ }
+ });
+}
+util.inherits(BlockEncoder, stream.Duplex);
+
+BlockEncoder.getDefaultCodecs = function () {
+ return {
+ 'null': function (buf, cb) { cb(null, buf); },
+ 'deflate': zlib.deflateRaw
+ };
+};
+
+BlockEncoder.prototype._write = function (val, encoding, cb) {
+ var codec = this._codec;
+ this._compress = (this._codecs || BlockEncoder.getDefaultCodecs())[codec];
+ if (!this._compress) {
+ this.emit('error', new Error(f('unsupported codec: %s', codec)));
+ return;
+ }
+
+ if (!this._omitHeader) {
+ var meta = {
+ 'avro.schema': new Buffer(this._schema || this._type.getSchema()),
+ 'avro.codec': new Buffer(this._codec)
+ };
+ var Header = HEADER_TYPE.getRecordConstructor();
+ var header = new Header(MAGIC_BYTES, meta, this._syncMarker);
+ this.push(header.$toBuffer());
+ }
+
+ this._write = this._writeChunk;
+ this._write(val, encoding, cb);
+};
+
+BlockEncoder.prototype._writeChunk = function (val, encoding, cb) {
+ var tap = this._tap;
+ var pos = tap.pos;
+
+ this._writeValue(tap, val);
+ if (!tap.isValid()) {
+ if (pos) {
+ this._flushChunk(pos);
+ }
+ var len = tap.pos - pos;
+ if (len > this._blockSize) {
+ // Not enough space for last written object, need to resize.
+ this._blockSize = len * 2;
+ }
+ tap.buf = new Buffer(this._blockSize);
+ tap.pos = 0;
+ this._writeValue(tap, val); // Rewrite last failed write.
+ }
+ this._blockCount++;
+
+ cb();
+};
+
+BlockEncoder.prototype._flushChunk = function (pos) {
+ var tap = this._tap;
+ pos = pos || tap.pos;
+ this._compress(tap.buf.slice(0, pos), this._createBlockCallback());
+ this._blockCount = 0;
+};
+
+BlockEncoder.prototype._read = function () {
+ var self = this;
+ var data = this._queue.pop();
+ if (!data) {
+ if (this._finished && !this._pending) {
+ process.nextTick(function () { self.push(null); });
+ } else {
+ this._needPush = true;
+ }
+ return;
+ }
+
+ this.push(LONG_TYPE.toBuffer(data.count, true));
+ this.push(LONG_TYPE.toBuffer(data.buf.length, true));
+ this.push(data.buf);
+ this.push(this._syncMarker);
+};
+
+BlockEncoder.prototype._createBlockCallback = function () {
+ var self = this;
+ var index = this._index++;
+ var count = this._blockCount;
+ this._pending++;
+
+ return function (err, data) {
+ if (err) {
+ self.emit('error', err);
+ return;
+ }
+ self._pending--;
+ self._queue.push(new BlockData(index, data, count));
+ if (self._needPush) {
+ self._needPush = false;
+ self._read();
+ }
+ };
+};
+
+
+/**
+ * Extract a container file's header synchronously.
+ *
+ */
+function extractFileHeader(path, opts) {
+ opts = opts || {};
+
+ var decode = opts.decode === undefined ? true : !!opts.decode;
+ var size = Math.max(opts.size || 4096, 4);
+ var fd = fs.openSync(path, 'r');
+ var buf = new Buffer(size);
+ var pos = 0;
+ var tap = new Tap(buf);
+ var header = null;
+
+ while (pos < 4) {
+ // Make sure we have enough to check the magic bytes.
+ pos += fs.readSync(fd, buf, pos, size - pos);
+ }
+ if (MAGIC_BYTES.equals(buf.slice(0, 4))) {
+ do {
+ header = HEADER_TYPE._read(tap);
+ } while (!isValid());
+ if (decode !== false) {
+ var meta = header.meta;
+ meta['avro.schema'] = JSON.parse(meta['avro.schema'].toString());
+ if (meta['avro.codec'] !== undefined) {
+ meta['avro.codec'] = meta['avro.codec'].toString();
+ }
+ }
+ }
+ fs.closeSync(fd);
+ return header;
+
+ function isValid() {
+ if (tap.isValid()) {
+ return true;
+ }
+ var len = 2 * tap.buf.length;
+ var buf = new Buffer(len);
+ len = fs.readSync(fd, buf, 0, len);
+ tap.buf = Buffer.concat([tap.buf, buf]);
+ tap.pos = 0;
+ return false;
+ }
+}
+
+
+/**
+ * Readable stream of records from a local Avro file.
+ *
+ */
+function createFileDecoder(path, opts) {
+ return fs.createReadStream(path).pipe(new BlockDecoder(opts));
+}
+
+
+/**
+ * Writable stream of records to a local Avro file.
+ *
+ */
+function createFileEncoder(path, schema, opts) {
+ var encoder = new BlockEncoder(schema, opts);
+ encoder.pipe(fs.createWriteStream(path, {defaultEncoding: 'binary'}));
+ return encoder;
+}
+
+
+// Helpers.
+
+/**
+ * An indexed block.
+ *
+ * This can be used to preserve block order since compression and decompression
+ * can cause some blocks to be returned out of order. The count is only
+ * used when encoding.
+ *
+ */
+function BlockData(index, buf, count) {
+ this.index = index;
+ this.buf = buf;
+ this.count = count | 0;
+}
+
+/**
+ * Maybe get a block.
+ *
+ */
+function tryReadBlock(tap) {
+ var pos = tap.pos;
+ var block = BLOCK_TYPE._read(tap);
+ if (!tap.isValid()) {
+ tap.pos = pos;
+ return null;
+ }
+ return block;
+}
+
+/**
+ * Create bytes consumer, either reading or skipping records.
+ *
+ */
+function createReader(decode, type) {
+ if (decode) {
+ return function (tap) { return type._read(tap); };
+ } else {
+ return (function (skipper) {
+ return function (tap) {
+ var pos = tap.pos;
+ skipper(tap);
+ return tap.buf.slice(pos, tap.pos);
+ };
+ })(type._skip);
+ }
+}
+
+/**
+ * Copy a buffer.
+ *
+ * This avoids having to create a slice of the original buffer.
+ *
+ */
+function copyBuffer(buf, pos, len) {
+ var copy = new Buffer(len);
+ buf.copy(copy, 0, pos, pos + len);
+ return copy;
+}
+
+/**
+ * Try to load a schema.
+ *
+ * This method will attempt to load schemas from a file if the schema passed is
+ * a string which isn't valid JSON and contains at least one slash.
+ *
+ */
+function loadSchema(schema) {
+ var obj;
+ if (typeof schema == 'string') {
+ try {
+ obj = JSON.parse(schema);
+ } catch (err) {
+ if (~schema.indexOf('/')) {
+ // This can't be a valid name, so we interpret it as a filepath. This
+ // makes it always feasible to read a file, independent of its name
+ // (i.e. even if its name is valid JSON), by prefixing it with `./`.
+ obj = JSON.parse(fs.readFileSync(schema));
+ }
+ }
+ }
+ if (obj === undefined) {
+ obj = schema;
+ }
+ return obj;
+}
+
+
+module.exports = {
+ HEADER_TYPE: HEADER_TYPE, // For tests.
+ MAGIC_BYTES: MAGIC_BYTES, // Idem.
+ parse: parse,
+ createFileDecoder: createFileDecoder,
+ createFileEncoder: createFileEncoder,
+ extractFileHeader: extractFileHeader,
+ streams: {
+ RawDecoder: RawDecoder,
+ BlockDecoder: BlockDecoder,
+ RawEncoder: RawEncoder,
+ BlockEncoder: BlockEncoder
+ }
+};
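`RawEncoder.prototype._transform` and `BlockEncoder.prototype._writeChunk` above share one pattern: write optimistically, and if the tap overflowed, emit the bytes that were valid, grow the buffer if the value alone cannot fit, then write the value again. A rough sketch of that pattern on plain byte chunks, using a hypothetical `BatchWriter` that is not part of the library:

```javascript
'use strict';

function BatchWriter(size) {
  this.buf = Buffer.alloc(size);
  this.pos = 0;
  this.chunks = []; // Flushed batches, in order.
}

// Advance `pos` unconditionally, but only copy when the bytes fit.
BatchWriter.prototype._tryWrite = function (bytes) {
  var pos = this.pos;
  this.pos += bytes.length;
  if (this.pos <= this.buf.length) {
    bytes.copy(this.buf, pos);
  }
};

BatchWriter.prototype.write = function (bytes) {
  var pos = this.pos;
  this._tryWrite(bytes);
  if (this.pos > this.buf.length) {
    if (pos) {
      // Emit a copy of the data written before the failed write.
      this.chunks.push(Buffer.from(this.buf.slice(0, pos)));
    }
    var len = this.pos - pos;
    if (len > this.buf.length) {
      // The value alone doesn't fit, so resize.
      this.buf = Buffer.alloc(2 * len);
    }
    this.pos = 0;
    this._tryWrite(bytes); // Rewrite the failed write.
  }
};

BatchWriter.prototype.flush = function () {
  if (this.pos) {
    this.chunks.push(Buffer.from(this.buf.slice(0, this.pos)));
    this.pos = 0;
  }
  return this.chunks;
};

var writer = new BatchWriter(4);
['abc', 'de', 'fghijk'].forEach(function (s) {
  writer.write(Buffer.from(s));
});
var batches = writer.flush().map(function (b) { return b.toString(); });
```

Copying the flushed prefix matters here because the underlying buffer is reused for the next batch, which is also why the encoder calls `copyBuffer` instead of `slice` in `_transform`.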
Added: avro/trunk/lang/js/lib/index.js
URL:
http://svn.apache.org/viewvc/avro/trunk/lang/js/lib/index.js?rev=1717850&view=auto
==============================================================================
--- avro/trunk/lang/js/lib/index.js (added)
+++ avro/trunk/lang/js/lib/index.js Thu Dec 3 21:35:44 2015
@@ -0,0 +1,45 @@
+/* jshint node: true */
+
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+'use strict';
+
+/**
+ * Main node.js entry point.
+ *
+ * See `etc/browser/avro.js` for the entry point used for browserify.
+ *
+ */
+
+var files = require('./files'),
+ schemas = require('./schemas'),
+ deprecated = require('../etc/deprecated/validator');
+
+
+module.exports = {
+ parse: files.parse,
+ createFileDecoder: files.createFileDecoder,
+ createFileEncoder: files.createFileEncoder,
+ extractFileHeader: files.extractFileHeader,
+ streams: files.streams,
+ types: schemas.types,
+ Validator: deprecated.Validator,
+ ProtocolValidator: deprecated.ProtocolValidator
+};
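For readers of the container-file streams exported above, the bytes that `BlockEncoder` emits for each block are, per its `_read` implementation in `lib/files.js`: the object count and the compressed size (both Avro `long`s), the compressed data, and the 16-byte sync marker. A small sketch of that framing with raw deflate, assuming a hypothetical `encodeLong` helper that only supports small non-negative values and using zlib's synchronous calls for brevity (the library's codecs are asynchronous):

```javascript
'use strict';

var zlib = require('zlib');

// Hypothetical helper: zigzag + varint encoding of a small non-negative
// integer, as used for Avro `long` values.
function encodeLong(n) {
  var bytes = [];
  var z = ((n << 1) ^ (n >> 31)) >>> 0;
  while (z > 0x7f) {
    bytes.push((z & 0x7f) | 0x80);
    z >>>= 7;
  }
  bytes.push(z);
  return Buffer.from(bytes);
}

// Frame one block: count, size in bytes, data, sync marker.
function frameBlock(count, data, syncMarker) {
  return Buffer.concat([
    encodeLong(count),
    encodeLong(data.length),
    data,
    syncMarker
  ]);
}

var sync = Buffer.alloc(16, 1); // Placeholder marker for the example.
var payload = zlib.deflateRawSync(Buffer.from('aaaaaaaaaaaaaaaa'));
var block = frameBlock(2, payload, sync);
```

A decoder reverses these steps: it reads the two longs, slices out the data, checks the trailing marker against the one from the file header, and only then decompresses.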