[jira] [Created] (NIFI-6057) For InvokeHTTP processor, allow request body to come from an EL expression

2019-02-19 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-6057:
---

 Summary: For InvokeHTTP processor, allow request body to come from 
an EL expression
 Key: NIFI-6057
 URL: https://issues.apache.org/jira/browse/NIFI-6057
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Alex Savitsky
Assignee: Alex Savitsky


Currently, there's either no body, or the body is taken from the flowfile. This 
Jira adds the ability to specify an EL expression for the request body instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-6011) Support schema versions in ConfluentSchemaRegistry service

2019-02-08 Thread Alex Savitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Savitsky updated NIFI-6011:

Description: ConfluentSchemaRegistry service only supports fetching the 
latest schema version by name/id, but the underlying schema registry REST 
service supports fetching schemas by the name-version pairs as well. NiFi 
schema registry already has all the necessary plumbing to pass the version 
along, so it makes sense to implement that option in ConfluentSchemaRegistry.  
(was: ConfluentSchemaRegistry service only supports fetching the latest schema 
version by name/id, but the underlying schema registry REST service supports 
fetching schemas by the name-version pairs as well - and NiFi schema registry 
already has all the necessary plumbing as well.)

> Support schema versions in ConfluentSchemaRegistry service
> --
>
> Key: NIFI-6011
> URL: https://issues.apache.org/jira/browse/NIFI-6011
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Alex Savitsky
>Assignee: Alex Savitsky
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ConfluentSchemaRegistry service only supports fetching the latest schema 
> version by name/id, but the underlying schema registry REST service supports 
> fetching schemas by the name-version pairs as well. NiFi schema registry 
> already has all the necessary plumbing to pass the version along, so it makes 
> sense to implement that option in ConfluentSchemaRegistry.





[jira] [Assigned] (NIFI-6011) Support schema versions in ConfluentSchemaRegistry service

2019-02-08 Thread Alex Savitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Savitsky reassigned NIFI-6011:
---

Assignee: Alex Savitsky

> Support schema versions in ConfluentSchemaRegistry service
> --
>
> Key: NIFI-6011
> URL: https://issues.apache.org/jira/browse/NIFI-6011
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Alex Savitsky
>Assignee: Alex Savitsky
>Priority: Major
>
> ConfluentSchemaRegistry service only supports fetching the latest schema 
> version by name/id, but the underlying schema registry REST service supports 
> fetching schemas by the name-version pairs as well - and NiFi schema registry 
> already has all the necessary plumbing as well.





[jira] [Created] (NIFI-6011) Support schema versions in ConfluentSchemaRegistry service

2019-02-08 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-6011:
---

 Summary: Support schema versions in ConfluentSchemaRegistry service
 Key: NIFI-6011
 URL: https://issues.apache.org/jira/browse/NIFI-6011
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Alex Savitsky


ConfluentSchemaRegistry service only supports fetching the latest schema 
version by name/id, but the underlying schema registry REST service supports 
fetching schemas by the name-version pairs as well - and NiFi schema registry 
already has all the necessary plumbing as well.
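For illustration: the Confluent Schema Registry REST API serves both GET /subjects/{subject}/versions/latest and GET /subjects/{subject}/versions/{version}. The sketch below shows how a client could choose between the two paths. The class and method names are hypothetical and are not taken from the NiFi code base; treating a null version as "latest" is an assumption made for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SchemaVersionPath {
    // Builds the registry path for a subject; a null version means "latest".
    // (Hypothetical helper, not part of ConfluentSchemaRegistry.)
    static String schemaPath(String subject, Integer version) {
        String encoded;
        try {
            encoded = URLEncoder.encode(subject, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
        String versionPart = (version == null) ? "latest" : String.valueOf(version);
        return "/subjects/" + encoded + "/versions/" + versionPart;
    }

    public static void main(String[] args) {
        System.out.println(schemaPath("orders-value", null));
        System.out.println(schemaPath("orders-value", 3));
    }
}
```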





[jira] [Created] (NIFI-5960) Wrong sub-schema picked for CHOICE datatype

2019-01-17 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5960:
---

 Summary: Wrong sub-schema picked for CHOICE datatype
 Key: NIFI-5960
 URL: https://issues.apache.org/jira/browse/NIFI-5960
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Affects Versions: 1.8.0
Reporter: Alex Savitsky


When CHOICE datatype contains multiple RECORD choices, if any of the RECORD 
schemas have all nullable fields, these schemas will be considered compatible 
with any input, and can be picked as a valid choice instead of a more 
appropriate sub-schema. The following unit test showcases the issue:
{code:java}
package org.apache.nifi.serialization.record;

import org.apache.nifi.serialization.SimpleRecordSchema;
import org.apache.nifi.serialization.record.type.ChoiceDataType;
import org.apache.nifi.serialization.record.type.RecordDataType;
import org.apache.nifi.serialization.record.util.DataTypeUtils;
import org.junit.Test;

import static java.util.Arrays.asList;
import static java.util.Collections.*;
import static org.apache.nifi.serialization.record.RecordFieldType.STRING;
import static org.junit.Assert.*;

public class DataTypeUtilsTest {
    @Test
    public void testChoiceCompatibility() {
        // choice1: a single non-nullable field that matches the input
        DataType choice1 = new RecordDataType(new SimpleRecordSchema(
                singletonList(new RecordField("field1", STRING.getDataType(), false))));
        // choice2: a single nullable field unrelated to the input
        DataType choice2 = new RecordDataType(new SimpleRecordSchema(
                singletonList(new RecordField("field2", STRING.getDataType()))));
        Record record = new MapRecord(new SimpleRecordSchema(emptyList()),
                singletonMap("field1", "value1"));
        // The unrelated schema is listed first to trigger the bug
        DataType dataType = DataTypeUtils.chooseDataType(record,
                new ChoiceDataType(asList(choice2, choice1)));
        assertEquals(choice1, dataType);
        assertNotEquals(choice2, dataType);
    }
}
{code}
When presented with an input containing a single field "field1", and a choice 
of two schemas, one with all nullable unrelated fields ("field2") and another 
with fields matching the input ("field1"), the chooseDataType() call will 
choose the unrelated schema if it's presented first.
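One possible direction, sketched without any NiFi types: instead of returning the first compatible sub-schema, score each candidate by how many of the record's field names it declares and pick the highest score. This is only an illustration of the idea, not the fix that went into NiFi; ChoiceScoring and chooseBest are hypothetical names, and schemas are reduced to plain sets of field names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class ChoiceScoring {
    // Returns the name of the candidate schema that shares the most field
    // names with the input record; ties go to the candidate listed first.
    static String chooseBest(Set<String> recordFields, Map<String, Set<String>> candidates) {
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, Set<String>> candidate : candidates.entrySet()) {
            int score = 0;
            for (String field : candidate.getValue()) {
                if (recordFields.contains(field)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = candidate.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Same ordering as the unit test: the unrelated schema comes first.
        Map<String, Set<String>> candidates = new LinkedHashMap<>();
        candidates.put("choice2", new HashSet<>(Arrays.asList("field2")));
        candidates.put("choice1", new HashSet<>(Arrays.asList("field1")));
        System.out.println(chooseBest(new HashSet<>(Arrays.asList("field1")), candidates));
    }
}
```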





[jira] [Comment Edited] (NIFI-5943) Enhance Avro conversions

2019-01-14 Thread Alex Savitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739396#comment-16739396
 ] 

Alex Savitsky edited comment on NIFI-5943 at 1/14/19 1:52 PM:
--

[~markap14] all the information to do the conversions is there, but the class 
currently doesn't do these conversions. This ticket is to add the conversions. 
BTW there's a PR open for it already 
([https://github.com/apache/nifi/pull/3267]) - usually Jira auto-links it once 
it's found


was (Author: alex_savitsky):
[~markap14] all the information to do the conversions is there, but the class 
currently doesn't do these conversions. This ticket is to add the conversions. 
BTW there's a PR open for it already 
([https://github.com/apache/nifi/pull/3258]) - usually Jira auto-links it once 
it's found

> Enhance Avro conversions
> 
>
> Key: NIFI-5943
> URL: https://issues.apache.org/jira/browse/NIFI-5943
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Alex Savitsky
>Assignee: Alex Savitsky
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> AvroTypeUtil has all the necessary information to support conversions of List 
> objects to ARRAY (currently only Java array is supported) as well as Map 
> objects to RECORD (currently only NiFi Record is supported)





[jira] [Created] (NIFI-5947) Elasticsearch lookup service that can work with LookupAttribute

2019-01-10 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5947:
---

 Summary: Elasticsearch lookup service that can work with 
LookupAttribute
 Key: NIFI-5947
 URL: https://issues.apache.org/jira/browse/NIFI-5947
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Alex Savitsky


Create an Elasticsearch-backed lookup service that can be used in a 
LookupAttribute processor.





[jira] [Commented] (NIFI-5943) Enhance Avro conversions

2019-01-10 Thread Alex Savitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739396#comment-16739396
 ] 

Alex Savitsky commented on NIFI-5943:
-

[~markap14] all the information to do the conversions is there, but the class 
currently doesn't do these conversions. This ticket is to add the conversions. 
BTW there's a PR open for it already 
([https://github.com/apache/nifi/pull/3258]) - usually Jira auto-links it once 
it's found

> Enhance Avro conversions
> 
>
> Key: NIFI-5943
> URL: https://issues.apache.org/jira/browse/NIFI-5943
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Alex Savitsky
>Assignee: Alex Savitsky
>Priority: Minor
>
> AvroTypeUtil has all the necessary information to support conversions of List 
> objects to ARRAY (currently only Java array is supported) as well as Map 
> objects to RECORD (currently only NiFi Record is supported)





[jira] [Created] (NIFI-5943) Enhance Avro conversions

2019-01-09 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5943:
---

 Summary: Enhance Avro conversions
 Key: NIFI-5943
 URL: https://issues.apache.org/jira/browse/NIFI-5943
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Alex Savitsky
Assignee: Alex Savitsky


AvroTypeUtil has all the necessary information to support conversions of List 
objects to ARRAY (currently only Java array is supported) as well as Map 
objects to RECORD (currently only NiFi Record is supported)





[jira] [Assigned] (NIFI-5909) PutElasticsearchHttpRecord doesn't allow to customize the timestamp format

2019-01-09 Thread Alex Savitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Savitsky reassigned NIFI-5909:
---

Assignee: Alex Savitsky

> PutElasticsearchHttpRecord doesn't allow to customize the timestamp format
> --
>
> Key: NIFI-5909
> URL: https://issues.apache.org/jira/browse/NIFI-5909
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Alex Savitsky
>Assignee: Alex Savitsky
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> All timestamps are sent to Elasticsearch in the "yyyy-MM-dd HH:mm:ss" format, 
> coming from RecordFieldType.TIMESTAMP.getDefaultFormat(). There are plenty 
> of use cases that call for Elasticsearch data to be presented differently, 
> and the format should be customizable.





[jira] [Assigned] (NIFI-5937) PutElasticsearchHttpRecord uses system default encoding

2019-01-08 Thread Alex Savitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Savitsky reassigned NIFI-5937:
---

Assignee: Alex Savitsky

> PutElasticsearchHttpRecord uses system default encoding
> ---
>
> Key: NIFI-5937
> URL: https://issues.apache.org/jira/browse/NIFI-5937
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Alex Savitsky
>Assignee: Alex Savitsky
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> PutElasticsearchHttpRecord line 348:
> {code:java}
> json.append(out.toString());
> {code}
> This results in the conversion being done using system default encoding, 
> possibly garbling non-ASCII characters in the output. Should use the encoding 
> configured in the processor in the toString call.
> As a workaround, the "file.encoding" system property can be specified 
> explicitly in the bootstrap.conf:
> {code:java}
> java.arg.7=-Dfile.encoding=UTF-8{code}





[jira] [Updated] (NIFI-5937) PutElasticsearchHttpRecord uses system default encoding

2019-01-08 Thread Alex Savitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Savitsky updated NIFI-5937:

Description: 
PutElasticsearchHttpRecord line 348:
{code:java}
json.append(out.toString());
{code}
This results in the conversion being done using system default encoding, 
possibly garbling non-ASCII characters in the output. Should use the encoding 
configured in the processor in the toString call.

As a workaround, the "file.encoding" system property can be specified 
explicitly in the bootstrap.conf:
{code:java}
java.arg.7=-Dfile.encoding=UTF-8{code}

  was:
PutElasticsearchHttpRecord line 348:
{code:java}
json.append(out.toString());
{code}
Should use the encoding configured in the processor in the toString call.


> PutElasticsearchHttpRecord uses system default encoding
> ---
>
> Key: NIFI-5937
> URL: https://issues.apache.org/jira/browse/NIFI-5937
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Alex Savitsky
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> PutElasticsearchHttpRecord line 348:
> {code:java}
> json.append(out.toString());
> {code}
> This results in the conversion being done using system default encoding, 
> possibly garbling non-ASCII characters in the output. Should use the encoding 
> configured in the processor in the toString call.
> As a workaround, the "file.encoding" system property can be specified 
> explicitly in the bootstrap.conf:
> {code:java}
> java.arg.7=-Dfile.encoding=UTF-8{code}





[jira] [Created] (NIFI-5937) PutElasticsearchHttpRecord uses system default encoding

2019-01-08 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5937:
---

 Summary: PutElasticsearchHttpRecord uses system default encoding
 Key: NIFI-5937
 URL: https://issues.apache.org/jira/browse/NIFI-5937
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Alex Savitsky


PutElasticsearchHttpRecord line 348:
{code:java}
json.append(out.toString());
{code}
Should use the encoding configured in the processor in the toString call.
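To illustrate the difference, here is a minimal standalone sketch. It assumes that `out` is a java.io.ByteArrayOutputStream; the class and method names are made up for the example and are not the processor's code. Decoding the buffered bytes with an explicit charset keeps non-ASCII characters intact regardless of the JVM default.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetDemo {
    // Decodes the buffered bytes with the given charset rather than relying
    // on the platform default that the no-argument toString() would use.
    static String decode(ByteArrayOutputStream out, Charset charset) {
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = "{\"city\":\"Z\u00fcrich\"}".getBytes(StandardCharsets.UTF_8);
        out.write(utf8, 0, utf8.length);

        // The charset would come from the processor configuration;
        // UTF-8 is hard-coded here only for the demonstration.
        System.out.println(decode(out, StandardCharsets.UTF_8));
    }
}
```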





[jira] [Created] (NIFI-5909) PutElasticsearchHttpRecord doesn't allow to customize the timestamp format

2018-12-19 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5909:
---

 Summary: PutElasticsearchHttpRecord doesn't allow to customize the 
timestamp format
 Key: NIFI-5909
 URL: https://issues.apache.org/jira/browse/NIFI-5909
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Alex Savitsky


All timestamps are sent to Elasticsearch in the "yyyy-MM-dd HH:mm:ss" format, 
coming from RecordFieldType.TIMESTAMP.getDefaultFormat(). There are plenty of 
use cases that call for Elasticsearch data to be presented differently, and the 
format should be customizable.
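A rough sketch of what "customizable" could mean: use a configured pattern when one is present, and fall back to the current default otherwise. The names below are hypothetical, the default pattern is assumed to be "yyyy-MM-dd HH:mm:ss", and java.time is used here only to keep the example self-contained; the processor itself may format timestamps differently.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class TimestampFormatDemo {
    // Assumed default, matching what the issue describes.
    static final String DEFAULT_PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Formats with the configured pattern, or the default when none is set.
    static String format(LocalDateTime timestamp, String configuredPattern) {
        String pattern = (configuredPattern == null || configuredPattern.isEmpty())
                ? DEFAULT_PATTERN
                : configuredPattern;
        return DateTimeFormatter.ofPattern(pattern).format(timestamp);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2018, 12, 19, 10, 30, 0);
        System.out.println(format(ts, null));
        System.out.println(format(ts, "yyyy-MM-dd'T'HH:mm:ss"));
    }
}
```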





[jira] [Created] (NIFI-5871) MockProcessSession.putAllAttributes should ignore the UUID attribute

2018-12-05 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5871:
---

 Summary: MockProcessSession.putAllAttributes should ignore the 
UUID attribute
 Key: NIFI-5871
 URL: https://issues.apache.org/jira/browse/NIFI-5871
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Affects Versions: 1.8.0
Reporter: Alex Savitsky


Currently, the method copies all attributes indiscriminately, but the 
interface Javadoc specifically states that the attribute "uuid" should be 
ignored. This leads to issues with testing, where two distinct flow files are 
considered the same in the session.
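The expected behavior can be sketched with plain maps. This is a simplified stand-in, not the MockProcessSession code; AttributeCopy is a hypothetical name, and attributes are modeled as a Map of String to String.

```java
import java.util.HashMap;
import java.util.Map;

class AttributeCopy {
    // Returns a copy of the existing attributes with the updates applied,
    // skipping any "uuid" entry so the original identifier is preserved.
    static Map<String, String> putAllAttributes(Map<String, String> existing,
                                                Map<String, String> updates) {
        Map<String, String> result = new HashMap<>(existing);
        for (Map.Entry<String, String> entry : updates.entrySet()) {
            if (!"uuid".equals(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> existing = new HashMap<>();
        existing.put("uuid", "original");
        Map<String, String> updates = new HashMap<>();
        updates.put("uuid", "other");
        updates.put("filename", "a.txt");
        System.out.println(putAllAttributes(existing, updates));
    }
}
```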





[jira] [Resolved] (NIFI-5731) Support List-typed fields for Array coercion

2018-11-28 Thread Alex Savitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Savitsky resolved NIFI-5731.
-
Resolution: Duplicate

> Support List-typed fields for Array coercion
> 
>
> Key: NIFI-5731
> URL: https://issues.apache.org/jira/browse/NIFI-5731
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.7.1
>Reporter: Alex Savitsky
>Priority: Trivial
>
> This Jira concerns the nifi-record module specifically.
> Currently, the toArray method of 
> org.apache.nifi.serialization.record.util.DataTypeUtils does not support 
> converting values of type java.util.List to Object[], despite the conversion 
> being a trivial one-liner:
> {code:java}
> return ((List) value).toArray();
> {code}
> The Record being converted doesn't always have the control to enforce all its 
> arrays being actual arrays and not collections (e.g., if the Record was 
> created in Groovy), and it would be nice to have it converted on the fly, 
> rather than having to transform it manually.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-5843) Unclear validation message for ScriptingComponentHelper

2018-11-28 Thread Alex Savitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702000#comment-16702000
 ] 

Alex Savitsky commented on NIFI-5843:
-

[~ijokarumawak] yes, your way is probably even better. 
[https://github.com/apache/nifi/pull/3186] has been created and is available 
for your review.

> Unclear validation message for ScriptingComponentHelper
> ---
>
> Key: NIFI-5843
> URL: https://issues.apache.org/jira/browse/NIFI-5843
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Alex Savitsky
>Priority: Trivial
>
> ScriptingComponentHelper.customValidate (from nifi-scripting-processors) 
> creates an unclear validation message
> {quote}'' is invalid because Exactly one of Script File or Script Body must 
> be set
> {quote}
> as it doesn't specify the validation subject. Since there are technically two 
> invalid subjects in this case (Script File and Script Body), I suggest adding 
> two validation messages, one for each subject.
> Current code:
> {code:java}
> results.add(new ValidationResult.Builder().valid(false).explanation(
> "Exactly one of Script File or Script Body must be set").build());
> {code}
> Proposed fix:
> {code:java}
> results.add(new ValidationResult.Builder().subject("Script 
> Body").valid(false).explanation(
> "exactly one of Script File or Script Body must be set").build());
> results.add(new ValidationResult.Builder().subject("Script 
> File").valid(false).explanation(
> "exactly one of Script File or Script Body must be set").build());
> {code}





[jira] [Created] (NIFI-5843) Unclear validation message for ScriptingComponentHelper

2018-11-27 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5843:
---

 Summary: Unclear validation message for ScriptingComponentHelper
 Key: NIFI-5843
 URL: https://issues.apache.org/jira/browse/NIFI-5843
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Affects Versions: 1.8.0
Reporter: Alex Savitsky


ScriptingComponentHelper.customValidate (from nifi-scripting-processors) 
creates an unclear validation message
{quote}'' is invalid because Exactly one of Script File or Script Body must be 
set
{quote}
as it doesn't specify the validation subject. Since there are technically two 
invalid subjects in this case (Script File and Script Body), I suggest adding 
two validation messages, one for each subject.

Current code:
{code:java}
results.add(new ValidationResult.Builder().valid(false).explanation(
"Exactly one of Script File or Script Body must be set").build());
{code}
Proposed fix:
{code:java}
results.add(new ValidationResult.Builder().subject("Script 
Body").valid(false).explanation(
"exactly one of Script File or Script Body must be set").build());
results.add(new ValidationResult.Builder().subject("Script 
File").valid(false).explanation(
"exactly one of Script File or Script Body must be set").build());
{code}





[jira] [Comment Edited] (NIFI-5735) Record-oriented processors/services do not properly support Avro Unions

2018-11-06 Thread Alex Savitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675735#comment-16675735
 ] 

Alex Savitsky edited comment on NIFI-5735 at 11/6/18 1:36 PM:
--

Attached is a patch against the master NiFi branch that fixes the issue.

General idea: convertToAvroObject now returns a pair of the original conversion 
result and the number of fields that failed the conversion for the underlying 
record type, if any (0 otherwise).

The only place where the second pair element is used, is in the lambda passed 
to convertUnionFieldValue.

Instead of simply returning the converted Avro object, the lambda now inspects 
the number of failed fields, throwing an exception if this number is not zero.

This signals the schema conversion error to the caller, allowing 
convertUnionFieldValue to continue iterating union schemas, until one is found 
that has all the fields recognized.

[^NIFI-5735.patch]


was (Author: alex_savitsky):
Attached is a patch against the master NiFi branch that fixes the issue. 
General idea: convertToAvroObject now returns a pair of the original conversion 
result and the number of fields that failed the conversion for the underlying 
record type, if any (0 otherwise). The only place where the second pair element 
is used, is in the lambda passed to convertUnionFieldValue. Instead of simply 
returning the converted Avro object, the lambda now inspects the number of 
failed fields, throwing an exception if this number is not zero. This signals 
the schema conversion error to the caller, allowing convertUnionFieldValue to 
continue iterating union schemas, until one is found that has all the fields 
recognized.

[^NIFI-5735.patch]

> Record-oriented processors/services do not properly support Avro Unions
> ---
>
> Key: NIFI-5735
> URL: https://issues.apache.org/jira/browse/NIFI-5735
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework, Extensions
>Affects Versions: 1.7.1
>Reporter: Daniel Solow
>Priority: Major
>  Labels: AVRO, avro
> Attachments: 
> 0001-NIFI-5735-added-preliminary-support-for-union-resolu.patch, 
> NIFI-5735.patch
>
>
> The [Avro spec|https://avro.apache.org/docs/1.8.2/spec.html#Unions] states:
> {quote}Unions may not contain more than one schema with the same type, 
> *except for the named types* record, fixed and enum. For example, unions 
> containing two array types or two map types are not permitted, but two types 
> with different names are permitted. (Names permit efficient resolution when 
> reading and writing unions.)
> {quote}
> However record oriented processors/services in Nifi do not support multiple 
> named types per union. This is a problem, for example, with the following 
> schema:
> {code:javascript}
> {
>   "type": "record",
>   "name": "root",
>   "fields": [
>     {
>       "name": "children",
>       "type": {
>         "type": "array",
>         "items": [
>           {
>             "type": "record",
>             "name": "left",
>             "fields": [
>               {
>                 "name": "f1",
>                 "type": "string"
>               }
>             ]
>           },
>           {
>             "type": "record",
>             "name": "right",
>             "fields": [
>               {
>                 "name": "f2",
>                 "type": "int"
>               }
>             ]
>           }
>         ]
>       }
>     }
>   ]
> }
> {code}
>  This schema contains a field named "children", which is an array of a union 
> type. The union type contains two possible record types. Currently the NiFi 
> Avro utilities will fail to process records of this schema with "children" 
> arrays that contain both "left" and "right" record types.
> I've traced this bug to the [AvroTypeUtil 
> class|https://github.com/apache/nifi/blob/rel/nifi-1.7.1/nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-avro-record-utils/src/main/java/org/apache/nifi/avro/AvroTypeUtil.java].
> Specifically there are bugs in the convertUnionFieldValue method and in the 
> buildAvroSchema method. Both of these methods make the assumption that an 
> Avro union can only contain one child type of each type. As stated in the 
> spec, this is true for primitive types and non-named complex types but not 
> for named types.
>  There may be related bugs elsewhere, but I haven't been able to locate them 
> yet.
>  
>  




[jira] [Commented] (NIFI-5735) Record-oriented processors/services do not properly support Avro Unions

2018-11-05 Thread Alex Savitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675735#comment-16675735
 ] 

Alex Savitsky commented on NIFI-5735:
-

Attached is a patch against the master NiFi branch that fixes the issue. 
General idea: convertToAvroObject now returns a pair of the original conversion 
result and the number of fields that failed the conversion for the underlying 
record type, if any (0 otherwise). The only place where the second pair element 
is used, is in the lambda passed to convertUnionFieldValue. Instead of simply 
returning the converted Avro object, the lambda now inspects the number of 
failed fields, throwing an exception if this number is not zero. This signals 
the schema conversion error to the caller, allowing convertUnionFieldValue to 
continue iterating union schemas, until one is found that has all the fields 
recognized.

[^NIFI-5735.patch]
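The idea can be sketched outside of NiFi and Avro as follows. This is a simplified model, not the attached patch: Schema, convert and resolve are hypothetical stand-ins for the real schema types, convertToAvroObject and convertUnionFieldValue, and records are reduced to plain maps.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UnionResolutionSketch {
    // Stand-in for a named record schema: a name plus its field names.
    static final class Schema {
        final String name;
        final List<String> fields;

        Schema(String name, String... fields) {
            this.name = name;
            this.fields = Arrays.asList(fields);
        }
    }

    // Returns the converted value paired with the number of input fields the
    // schema did not recognize (0 means every field was converted).
    static Map.Entry<Map<String, Object>, Integer> convert(Map<String, Object> value, Schema schema) {
        Map<String, Object> converted = new LinkedHashMap<>();
        int failed = 0;
        for (Map.Entry<String, Object> entry : value.entrySet()) {
            if (schema.fields.contains(entry.getKey())) {
                converted.put(entry.getKey(), entry.getValue());
            } else {
                failed++;
            }
        }
        return new SimpleImmutableEntry<>(converted, failed);
    }

    // Tries each union member in order. A non-zero failure count is raised as
    // an exception, which makes the loop move on to the next member.
    static String resolve(Map<String, Object> value, List<Schema> unionMembers) {
        for (Schema candidate : unionMembers) {
            try {
                int failed = convert(value, candidate).getValue();
                if (failed != 0) {
                    throw new IllegalArgumentException(failed + " unrecognized field(s)");
                }
                return candidate.name;
            } catch (IllegalArgumentException e) {
                // Not a match; continue with the next union member.
            }
        }
        throw new IllegalArgumentException("No union member matches the value");
    }

    public static void main(String[] args) {
        List<Schema> members = Arrays.asList(new Schema("left", "f1"), new Schema("right", "f2"));
        Map<String, Object> value = new HashMap<>();
        value.put("f2", 42);
        System.out.println(resolve(value, members));
    }
}
```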



[jira] [Updated] (NIFI-5735) Record-oriented processors/services do not properly support Avro Unions

2018-11-05 Thread Alex Savitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Savitsky updated NIFI-5735:

Attachment: NIFI-5735.patch



[jira] [Created] (NIFI-5731) Support List-typed fields for Array coercion

2018-10-19 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5731:
---

 Summary: Support List-typed fields for Array coercion
 Key: NIFI-5731
 URL: https://issues.apache.org/jira/browse/NIFI-5731
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Affects Versions: 1.7.1
Reporter: Alex Savitsky


This Jira concerns the nifi-record module specifically.

Currently, the toArray method of 
org.apache.nifi.serialization.record.util.DataTypeUtils does not support 
converting values of type java.util.List to Object[], despite the conversion 
being a trivial one-liner:
{code:java}
return ((List) value).toArray();
{code}
The Record being converted doesn't always have the control to enforce all its 
arrays being actual arrays and not collections (e.g., if the Record was created 
in Groovy), and it would be nice to have it converted on the fly, rather than 
having to transform it manually.
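For context, a standalone sketch of a coercion method that accepts both arrays and lists. It is a simplified illustration, not the actual DataTypeUtils.toArray implementation; ListToArray is a hypothetical name, and the real method handles more input types than shown here.

```java
import java.util.Arrays;
import java.util.List;

class ListToArray {
    // Returns Object[] for arrays and Lists, null for null, and rejects
    // any other input type.
    static Object[] toArray(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Object[]) {
            return (Object[]) value;
        }
        if (value instanceof List) {
            // The one-liner proposed in this issue
            return ((List<?>) value).toArray();
        }
        throw new IllegalArgumentException(
                "Cannot convert value of type " + value.getClass() + " to Object[]");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toArray(Arrays.asList("a", "b"))));
    }
}
```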





[jira] [Created] (NIFI-5699) Make the property descriptors accessible to the subclasses of AbstractRecordProcessor

2018-10-15 Thread Alex Savitsky (JIRA)
Alex Savitsky created NIFI-5699:
---

 Summary: Make the property descriptors accessible to the 
subclasses of AbstractRecordProcessor
 Key: NIFI-5699
 URL: https://issues.apache.org/jira/browse/NIFI-5699
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Affects Versions: 1.7.1
Reporter: Alex Savitsky


Currently, RECORD_READER and RECORD_WRITER are declared with default 
(package-private) visibility, but the class is abstract. I don't think it's 
reasonable to expect all subclasses to reside in 
org.apache.nifi.processors.standard to be able to access the property 
descriptors (e.g. for testing). Please make them public; they're final anyway.


