tpalfy commented on code in PR #6794:
URL: https://github.com/apache/nifi/pull/6794#discussion_r1108665694
##########
nifi-nar-bundles/nifi-salesforce-bundle/nifi-salesforce-processors/src/main/java/org/apache/nifi/processors/salesforce/QuerySalesforceObject.java:
##########
@@ -224,6 +282,8 @@ protected List<PropertyDescriptor> getSupportedPropertyDescriptors() {
return Collections.unmodifiableList(Arrays.asList(
API_URL,
Review Comment:
When switching between PROPERTY_BASED and CUSTOM_QUERY, the list of properties on the processor changes in a confusing way because dependent and non-dependent properties are intermingled.
Can we group them a bit better and move the non-dependent properties either before or after the dependent ones? See the example below.
Also, CREATE_ZERO_RECORD_FILES doesn't work with CUSTOM_QUERY.
```java
API_URL,
API_VERSION,
QUERY_TYPE,
CUSTOM_SOQL_QUERY,
SOBJECT_NAME,
FIELD_NAMES,
RECORD_WRITER,
AGE_FIELD,
INITIAL_AGE_FILTER,
AGE_DELAY,
CUSTOM_WHERE_CONDITION,
READ_TIMEOUT,
CREATE_ZERO_RECORD_FILES,
TOKEN_PROVIDER
```
##########
nifi-nar-bundles/nifi-salesforce-bundle/nifi-salesforce-processors/src/main/java/org/apache/nifi/processors/salesforce/QuerySalesforceObject.java:
##########
@@ -106,13 +111,55 @@
@DefaultSchedule(strategy = SchedulingStrategy.TIMER_DRIVEN, period = "1 min")
public class QuerySalesforceObject extends AbstractProcessor {
+ static final AllowableValue PROPERTY_BASED_QUERY = new
AllowableValue("property-based-query", "Property Based Query", "Provide query
by properties.");
+ static final AllowableValue CUSTOM_QUERY = new
AllowableValue("custom-query", "Custom Query", "Provide custom SOQL query.");
+
+ static final PropertyDescriptor API_URL = new PropertyDescriptor.Builder()
Review Comment:
Why can't we use `API_URL` and `API_VERSION` from `CommonSalesforceProperties`?
##########
nifi-nar-bundles/nifi-salesforce-bundle/nifi-salesforce-processors/src/main/java/org/apache/nifi/processors/salesforce/QuerySalesforceObject.java:
##########
@@ -261,6 +321,23 @@ protected Collection<ValidationResult> customValidate(ValidationContext validati
     @Override
     public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
+        boolean isCustomQuery = CUSTOM_QUERY.getValue().equals(context.getProperty(QUERY_TYPE).getValue());
+        AtomicReference<String> nextRecordsUrl = new AtomicReference<>();
+        AtomicReference<String> totalSize = new AtomicReference<>();
Review Comment:
These `AtomicReference` objects are not used here. I don't think their creation should be the responsibility of `onTrigger`; it belongs in the downstream methods.
The `nextRecordsUrl` constructor would then be called explicitly in both of those methods, but I wouldn't consider that duplication. Having these declared closer to the `do...while` blocks they are used in would make their purpose much easier to understand.
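To illustrate the idea with a framework-free sketch (all names here are made up for illustration, not NiFi API), the reference that tracks the pagination cursor is created inside the method that owns the `do...while` loop, right next to its use:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the suggestion: the AtomicReference carrying the
// pagination cursor is declared inside the method that runs the do...while
// loop, instead of being created by the caller and passed down.
class PaginationSketch {

    // fakeResponses stands in for successive HTTP result pages
    static List<String> fetchAllPages(Deque<String> fakeResponses) {
        AtomicReference<String> nextRecordsUrl = new AtomicReference<>(); // created next to its loop
        List<String> pages = new ArrayList<>();
        do {
            pages.add(fakeResponses.poll());                 // stand-in for the HTTP call
            // stand-in for parseHttpResponse(): the last page leaves the cursor null
            nextRecordsUrl.set(fakeResponses.isEmpty() ? null : "fake-next-url");
        } while (nextRecordsUrl.get() != null);
        return pages;
    }
}
```

The caller never sees the cursor at all; it is purely an implementation detail of the paging loop.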
##########
nifi-nar-bundles/nifi-salesforce-bundle/nifi-salesforce-processors/src/main/java/org/apache/nifi/processors/salesforce/QuerySalesforceObject.java:
##########
@@ -387,11 +461,63 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
         } while (nextRecordsUrl.get() != null);
     }
-    private InputStream getResultInputStream(AtomicReference<String> nextRecordsUrl, String querySObject) {
-        if (nextRecordsUrl.get() == null) {
+    private void processCustomQuery(ProcessContext context, ProcessSession session, FlowFile flowFile, AtomicReference<String> nextRecordsUrl, AtomicReference<String> totalSize) {
+        String customQuery = context.getProperty(CUSTOM_SOQL_QUERY).evaluateAttributeExpressions(flowFile).getValue();
+        do {
+            FlowFile outgoingFlowFile;
+            if (flowFile != null) {
+                outgoingFlowFile = session.create(flowFile);
+            } else {
+                outgoingFlowFile = session.create();
+            }
+            Map<String, String> attributes = new HashMap<>();
+            try (InputStream response = getResultInputStream(nextRecordsUrl.get(), customQuery)) {
+                outgoingFlowFile = session.write(outgoingFlowFile, parseHttpResponse(response, nextRecordsUrl, totalSize));
+                int recordCount = nextRecordsUrl.get() != null ? 2000 : Integer.parseInt(totalSize.get()) % 2000;
+                attributes.put(CoreAttributes.MIME_TYPE.key(), "application/json");
+                attributes.put(TOTAL_RECORD_COUNT, String.valueOf(recordCount));
+                session.adjustCounter("Salesforce records processed", recordCount, false);
+                session.putAllAttributes(outgoingFlowFile, attributes);
+                session.transfer(outgoingFlowFile, REL_SUCCESS);
+            } catch (IOException e) {
+                session.remove(outgoingFlowFile);
+                throw new ProcessException("Couldn't get Salesforce records", e);
+            }
+        } while (nextRecordsUrl.get() != null);
+        if (flowFile != null) {
+            session.remove(flowFile);
Review Comment:
Since we _can_ have incoming flowfiles, we should consider adding an
ORIGINAL and a FAILURE relationship.
##########
nifi-nar-bundles/nifi-salesforce-bundle/nifi-salesforce-processors/src/main/java/org/apache/nifi/processors/salesforce/QuerySalesforceObject.java:
##########
@@ -387,11 +461,63 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
         } while (nextRecordsUrl.get() != null);
     }
-    private InputStream getResultInputStream(AtomicReference<String> nextRecordsUrl, String querySObject) {
-        if (nextRecordsUrl.get() == null) {
+    private void processCustomQuery(ProcessContext context, ProcessSession session, FlowFile flowFile, AtomicReference<String> nextRecordsUrl, AtomicReference<String> totalSize) {
+        String customQuery = context.getProperty(CUSTOM_SOQL_QUERY).evaluateAttributeExpressions(flowFile).getValue();
+        do {
+            FlowFile outgoingFlowFile;
+            if (flowFile != null) {
+                outgoingFlowFile = session.create(flowFile);
+            } else {
+                outgoingFlowFile = session.create();
+            }
+            Map<String, String> attributes = new HashMap<>();
+            try (InputStream response = getResultInputStream(nextRecordsUrl.get(), customQuery)) {
+                outgoingFlowFile = session.write(outgoingFlowFile, parseHttpResponse(response, nextRecordsUrl, totalSize));
+                int recordCount = nextRecordsUrl.get() != null ? 2000 : Integer.parseInt(totalSize.get()) % 2000;
+                attributes.put(CoreAttributes.MIME_TYPE.key(), "application/json");
+                attributes.put(TOTAL_RECORD_COUNT, String.valueOf(recordCount));
+                session.adjustCounter("Salesforce records processed", recordCount, false);
+                session.putAllAttributes(outgoingFlowFile, attributes);
+                session.transfer(outgoingFlowFile, REL_SUCCESS);
+            } catch (IOException e) {
+                session.remove(outgoingFlowFile);
+                throw new ProcessException("Couldn't get Salesforce records", e);
+            }
+        } while (nextRecordsUrl.get() != null);
+        if (flowFile != null) {
+            session.remove(flowFile);
+        }
+    }
+
+    private OutputStreamCallback parseHttpResponse(final InputStream in, AtomicReference<String> nextRecordsUrl, AtomicReference<String> totalSize) {
+        nextRecordsUrl.set(null);
+        return out -> {
+            try (JsonParser jsonParser = JSON_FACTORY.createParser(in);
+                 JsonGenerator jsonGenerator = JSON_FACTORY.createGenerator(out, JsonEncoding.UTF8)) {
+                while (jsonParser.nextToken() != null) {
+                    if (jsonParser.getCurrentToken() == JsonToken.FIELD_NAME && jsonParser.getCurrentName().equals(TOTAL_SIZE)) {
+                        jsonParser.nextToken();
+                        totalSize.set(jsonParser.getValueAsString());
+                    } else if (jsonParser.getCurrentToken() == JsonToken.FIELD_NAME && jsonParser.getCurrentName().equals(NEXT_RECORDS_URL)) {
+                        jsonParser.nextToken();
+                        nextRecordsUrl.set(jsonParser.getValueAsString());
+                    } else if (jsonParser.getCurrentToken() == JsonToken.FIELD_NAME && jsonParser.getCurrentName().equals(RECORDS)) {
+                        jsonParser.nextToken();
+                        jsonGenerator.copyCurrentStructure(jsonParser);
+                    }
Review Comment:
Bit of a code duplication here.
I would consider something like
```suggestion
while (jsonParser.nextToken() != null) {
if (nextTokenIs(jsonParser, JsonToken.FIELD_NAME,
TOTAL_SIZE)) {
totalSize.set(jsonParser.getValueAsString());
} else if (nextTokenIs(jsonParser, JsonToken.FIELD_NAME,
NEXT_RECORDS_URL)) {
nextRecordsUrl.set(jsonParser.getValueAsString());
} else if (nextTokenIs(jsonParser, JsonToken.FIELD_NAME,
RECORDS)) {
jsonGenerator.copyCurrentStructure(jsonParser);
}
```
with
```java
private boolean nextTokenIs(JsonParser jsonParser, JsonToken fieldName,
String value) throws IOException {
return jsonParser.getCurrentToken() == fieldName &&
jsonParser.getCurrentName()
.equals(value) && jsonParser.nextToken() != null;
}
```
I do something uncommon here because this boolean method not only checks
something but also moves to the next token if the corresponding check evaluates
to true. But this type of behaviour in parsers are usually acceptable.
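To show why the check-and-advance idiom stays readable, here is a Jackson-free sketch of the same pattern over a plain token list (`TokenCursor` and its method names are made up for illustration and are not part of any real parser API):

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the "check and advance" idiom: the predicate tests
// the current token and, only on a match, moves the cursor forward so the
// caller can read the following value token directly.
class TokenCursor {
    private final Iterator<String> tokens;
    private String current;

    TokenCursor(List<String> input) {
        tokens = input.iterator();
        advance();
    }

    String current() {
        return current;
    }

    void advance() {
        current = tokens.hasNext() ? tokens.next() : null;
    }

    // Returns true only when the current token matches the expected field
    // name; as a side effect it advances past it, mirroring the suggestion.
    boolean nextTokenIs(String expectedFieldName) {
        if (expectedFieldName.equals(current)) {
            advance();
            return true;
        }
        return false;
    }
}
```

With a token stream like `["totalSize", "42"]`, `nextTokenIs("totalSize")` returns true and leaves the cursor on `"42"`, so the caller reads the value immediately after the matching branch, just as `getValueAsString()` is called right after each `nextTokenIs` check above.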
##########
nifi-nar-bundles/nifi-salesforce-bundle/nifi-salesforce-processors/src/main/java/org/apache/nifi/processors/salesforce/QuerySalesforceObject.java:
##########
@@ -162,6 +212,7 @@ public class QuerySalesforceObject extends AbstractProcessor {
             .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
             .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
             .dependsOn(AGE_FIELD)
+            .dependsOn(QUERY_TYPE, PROPERTY_BASED_QUERY)
Review Comment:
Currently CUSTOM_QUERY does not support incremental loading. That is probably intentional, but in that case we should emphasize it on the `additionalDetails.html` page and maybe even in the `@CapabilityDescription`.
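For instance (the wording is only a suggestion, and the property display names are taken from the ones this PR adds), the `additionalDetails.html` page could state the limitation explicitly:

```html
<p>
    NOTE: Incremental loading (the Age Field, Initial Age Filter and Age Delay properties)
    is only supported when the Query Type is Property Based Query. A Custom Query always
    retrieves the full result set on every run.
</p>
```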
##########
nifi-nar-bundles/nifi-salesforce-bundle/nifi-salesforce-processors/src/main/resources/docs/org.apache.nifi.processors.salesforce.QuerySalesforceObject/additionalDetails.html:
##########
@@ -34,8 +34,8 @@ <h3>Description</h3>
         <p>
             Objects in Salesforce are database tables, their rows are known as records, and their columns are called fields. The QuerySalesforceObject processor queries Salesforce objects and retrieves their records.
-            The processor constructs the query using SOQL (Salesforce Object Query Language) and retrieves the result record dataset using the Salesforce REST API.
-            The processor utilizes streams and NiFi record-based processing to be able to handle a large number of records and to allow arbitrary output format.
+            The processor constructs the query from parameters or executes a custom SOQL (Salesforce Object Query Language) query and retrieves the result record dataset using the Salesforce REST API.
Review Comment:
```suggestion
            The processor constructs the query from processor properties or executes a custom SOQL (Salesforce Object Query Language) query and retrieves the result record dataset using the Salesforce REST API.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]