Re: Wait processor doesn't route expired flowfiles to 'expired' relationship

2020-12-20 Thread Koji Kawamura
Hi Luca, Sorry to hear that you are having an issue with Wait processor. By looking at the code and testing it locally, I couldn't find a cause of the issue nor reproduce it. However, theoretically such a situation can happen when there are too many queued FlowFiles in the connection in front of

Re: Supporting Elasticsearch scrolling with an input flow file

2019-11-12 Thread Koji Kawamura
Hi Tim, Sorry for the late reply. It seems the ScrollElasticsearchHttp processor is designed to run a one-shot query to import query results from Elasticsearch. The description says "The state must be cleared before another query can be run." It tracks progress using managed state, not via

Re: Re: MergeRecord can not guarantee the ordering of the input sequence?

2019-10-20 Thread Koji Kawamura
look up the MergeRecord code and do further debug. > > Thanks, > Lei > > > > > wangl...@geekplus.com.cn > From: Koji Kawamura > Date: 2019-10-16 09:46 > To: users > CC: dev > Subject: Re: MergeRecord can not guarantee the ordering of the input sequence? >

Re: MergeRecord can not guarantee the ordering of the input sequence?

2019-10-15 Thread Koji Kawamura
Hi Lei, How about setting FIFO prioritizer at all the preceding connections before the MergeRecord? Without setting any prioritizer, FlowFile ordering is nondeterministic. Thanks, Koji On Tue, Oct 15, 2019 at 8:56 PM wangl...@geekplus.com.cn wrote: > > > If FlowFile A, B, C enter the

Re: Data inconsistency happens when using CDC to replicate my database

2019-10-15 Thread Koji Kawamura
Hi Lei, To address FlowFile ordering issue related to CaptureChangeMySQL, I'd recommend using EnforceOrder processor and FIFO prioritizer before a processor that requires precise ordering. EnforceOrder can use "cdc.sequence.id" attribute. Thanks, Koji On Tue, Oct 15, 2019 at 1:14 PM
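The reordering idea behind the EnforceOrder suggestion can be illustrated outside NiFi. The following is only a conceptual sketch, not the processor's implementation: the function name `enforce_order`, the starting sequence id of 1, and the sample events are assumptions made for illustration. Events that arrive early are held back until every earlier sequence id has been seen.

```python
from typing import Dict, Iterable, Iterator, Tuple


def enforce_order(
    events: Iterable[Tuple[int, str]], first_id: int = 1
) -> Iterator[Tuple[int, str]]:
    """Yield (sequence_id, payload) pairs in strictly increasing id order.

    Hypothetical helper: events arriving ahead of their turn are buffered
    until all earlier ids have been released.
    """
    pending: Dict[int, str] = {}
    expected = first_id
    for seq_id, payload in events:
        pending[seq_id] = payload
        # Release every buffered event that is now next in line.
        while expected in pending:
            yield expected, pending.pop(expected)
            expected += 1


# Made-up CDC events arriving out of order, keyed by a sequence id.
arrived = [(2, "UPDATE"), (1, "INSERT"), (4, "DELETE"), (3, "UPDATE")]
ordered = list(enforce_order(arrived))
print(ordered)
```

The sketch ignores gaps that never fill; a real flow would also need a timeout or a failure route for that case.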

Re: Apache NIFI report:Caused by: java.lang.OutOfMemoryError: Compressed class space

2019-10-10 Thread Koji Kawamura
Hello, Did you check nifi-bootstrap.log? Since the output is logged to stdout, such information is logged to nifi-bootstrap.log instead of nifi-app.log. Thanks, Koji On Thu, Oct 10, 2019 at 8:14 PM abellnotring wrote: > Hi,all > I’m running two nodes NIFI cluster. some day, it reported an

Re: Can CaptureChangeMySQL be scheduled to all nodes instead of primary node?

2019-10-10 Thread Koji Kawamura
Hi Lei, I don't know any NiFi built-in feature to achieve that. To distribute CaptureChangeMySQL load among nodes, I'd deploy separate standalone NiFi (or even MiNiFi Java) in addition to the main NiFi cluster for the main data flow. For example, if there are 5 databases and 3 NiFi nodes, deploy

Re: NIFI - Fetchfile - Execute SQL - Put Database Record

2019-10-10 Thread Koji Kawamura
Hi Asmath, How about using PutSQL? FetchFile -> PutSQL -> PutDatabaseRecord You can specify a SQL statement at PutSQL 'SQL Statement' property, using FlowFile attribute. For example, delete from tbl where file_name = '${filename}' This way, the FlowFile content can be passed to PutDatabaseRecord

Re: How keep the from losing original content when a replacetext is performed for an invokehttp

2019-08-27 Thread Koji Kawamura
Hi William, Wait/Notify may be a possible approach. - UpdateAttribute --> ReplaceText --> InvokeHttp --> Notify --> Wait --> PutFile - Use UpdateAttribute to add an attribute named 'wait.id' via expression '${UUID()}' - Connect the 'success' relationship from UpdateAttribute to both ReplaceText

Re: Using ConvertRecord on compressed input

2019-07-30 Thread Koji Kawamura
try to write a patch for it. I agree with your suggestion to add > the support to specific record readers (e.g. `CSVReader`). > > On Mon, 29 Jul 2019 at 01:18, Koji Kawamura > wrote: > > > > Hello, > > > > Thanks for your question. I've posted my comment to the St

Re: Bug/Issue with ReplaceTextWithMapping

2019-07-30 Thread Koji Kawamura
ay that this locking was added later - version 1.9 or something? We >> are using 1.8. >> >> Thanks, >> Ameer Mawia >> >> On Thu, Jul 25, 2019 at 3:51 AM Koji Kawamura >> wrote: >> >>> Hi Ameer, >>> >>> Is the ReplaceTextW

Re: Using ConvertRecord on compressed input

2019-07-28 Thread Koji Kawamura
Hello, Thanks for your question. I've posted my comment to the StackOverflow question. I'd avoid adding it to the core package as some Record formats handle compressed inputs by themselves, like Avro. http://apache-avro.679487.n3.nabble.com/read-a-compressed-avro-file-td3872899.html Adding

Re: Debugging info for a stuck SelectHiveQL processor

2019-07-25 Thread Koji Kawamura
Hi Pat, I recommend getting a thread dump the next time you encounter the situation. A thread dump shows what each thread is doing, including the stuck SelectHiveQL thread. You can get a thread dump by executing: ${NIFI_HOME}/bin/nifi.sh dump <dump-file-name> Then thread stack traces are logged to the

Re: Bug/Issue with ReplaceTextWithMapping

2019-07-25 Thread Koji Kawamura
Mawia wrote: > > Inline. > > On Mon, Jul 22, 2019 at 2:17 AM Koji Kawamura wrote: >> >> Hi Ameer, >> >> How is ReplaceTextWithMapping 'Mapping File Refresh Interval' configured? > > [Ameer] It is configured to 1sec - the lowest value allowed. >

Re: Bug/Issue with ReplaceTextWithMapping

2019-07-22 Thread Koji Kawamura
Hi Ameer, How is ReplaceTextWithMapping 'Mapping File Refresh Interval' configured? By default, it's set to '60s'. So, 1. If ReplaceTextWithMapping ran with the old mapping file 2. and the mapping file was updated for the next processing 3. then the flow started processing another CSV file right

Re: Nifi and SSL offloading

2019-07-07 Thread Koji Kawamura
Hi Nicolas, As you already know, all authentication methods implemented in NiFi require a secure connection. Each implementation class uses HttpServletRequest.isSecure method to determine if authentication is necessary. For example, JWTAuthenticationFilter:

Re: NiFi Wait / Notify not releasing on signal

2019-06-19 Thread Koji Kawamura
(Forgot to click the SEND button...) Hi ara, thank you for sharing the issue. I've submitted a PR to add a new 'Penalize Waiting FlowFiles' property to the Wait processor. https://github.com/apache/nifi/pull/3538/files To keep existing flows intact, it's disabled by default. If enabled, FlowFiles

Re: NiFi - create folders and files based on hostname and date

2019-05-22 Thread Koji Kawamura
Hello, In order to create the folders automatically, I would use PutFile like below: - Assuming the incoming FlowFile has an attribute named 'host' so that PutFile's 'Directory' can refer to it using NiFi Expression Language (EL) - The date part can also be generated using the 'now()' EL function -

Re: execute process

2019-05-22 Thread Koji Kawamura
Hello, Which version of NiFi are you using? The relationship "nonzero status" is available since 1.5.0. NIFI-4559: ExecuteStreamCommand should have a failure relationship https://issues.apache.org/jira/browse/NIFI-4559 I hope you can try that. Thanks, Koji On Wed, May 22, 2019 at 11:15 PM

Re: Advice on orchestrating Nifi with dockerized external services

2019-04-10 Thread Koji Kawamura
Hi Eric, Although my knowledge on MiNiFi, Python and Go is limited, I wonder if "nanofi" library can be used from the proprietary application so that they can fetch FlowFiles directly using Site-to-Site protocol. That can be an interesting approach and will be able to eliminate the need of

Re: GetHbase state

2019-04-10 Thread Koji Kawamura
> of Hbase which is why we attempted scenario 3 (staged) below. We were very > surprised that we reloaded data under these conditions. > > Thanks again for your input. > > Dwane > > From: Koji Kawamura > Sent: Friday, 5 April 2019 6:0

Re: GetHbase state

2019-04-05 Thread Koji Kawamura
Hi Dwane, Does the Pig job put HBase data with custom timestamps? For example, the loading data contains a last_modified timestamp, and it's used as the HBase cell timestamp. If that's the case, GetHbase may miss some HBase rows, unless the Pig job loads rows ordered by the timestamp when Pig and

Re: NiFi Registry Not Auditing Denied Errors

2019-04-04 Thread Koji Kawamura
Hi Shawn, The 'No applicable policies could be found.' message can be logged when a request is made against a resource which doesn't exist.

Re: Problem with load balancing option

2019-03-25 Thread Koji Kawamura
't see any mention of this new section within nifi.properties. It might > be good idea to add a section about this so that people upgrading their > cluster have all the information at hand. This might save them some time. > > Thanks all for your outstanding work > --

Re: Problem with load balancing option

2019-03-24 Thread Koji Kawamura
Hi, That looks similar to this one: Occasionally FlowFiles appear to get "stuck" in a Load-Balanced Connection https://issues.apache.org/jira/browse/NIFI-5919 If you're using NiFi 1.8.0, I recommend trying the latest 1.9.1 which has the fix for the above issue. Hope this helps. Koji On Sat,

Re: How can I ExtractGrok from end-of-file?

2019-03-19 Thread Koji Kawamura
Hello Eric, Have you found any solution for this? If your trailers (footer?) start with a certain byte sequence, then SplitContent may be helpful to split the content into Header+Payload, and the Trailers. If that works, then the subsequent flow can do something creative probably using

Re: Convert Avro to ORC or JSON processor - retaining the data type

2019-03-10 Thread Koji Kawamura
s Koji for the response. Our users want to run hiveql queries with some > comparators and cannot work with string type for numeric data type. > > Any other options? > > Thanks, > Ravi Papisetti > > On 07/03/19, 7:14 PM, "Koji Kawamura" wrote: > > Hi Ravi,

Re: QueryRecord and NULLs

2019-03-07 Thread Koji Kawamura
Using NULLIF can be a workaround. I was able to populate new columns with null. SELECT * ,NULLIF(5, 5) as unit_cerner_alias ,NULLIF(5, 5) as room_cerner_alias ,NULLIF(5, 5) as bed_cerner_alias FROM FLOWFILE On Fri, Mar 8, 2019 at 7:57 AM Boris Tyukin wrote: > > I am struggling for an hour now
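The NULLIF trick relies on standard SQL semantics: NULLIF(a, b) yields NULL when both arguments are equal. As a rough check outside NiFi, here is a sketch that uses SQLite as a stand-in for Calcite (an assumption; QueryRecord itself runs on Calcite). The table name `flowfile` and the columns `unit` and `room` are made up for illustration.

```python
import sqlite3

# SQLite stands in for the SQL engine here; the table and its columns
# are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flowfile (unit TEXT, room TEXT)")
conn.execute("INSERT INTO flowfile VALUES ('ICU', '12')")

# NULLIF(5, 5) evaluates to NULL, so each alias becomes a new null column.
row = conn.execute(
    "SELECT *, NULLIF(5, 5) AS unit_cerner_alias, "
    "NULLIF(5, 5) AS room_cerner_alias FROM flowfile"
).fetchone()
conn.close()
print(row)
```

The existing columns pass through unchanged and the added aliases come back as None, which is how Python's sqlite3 module represents NULL.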

Re: Convert Avro to ORC or JSON processor - retaining the data type

2019-03-07 Thread Koji Kawamura
Hi Ravi, I looked at following links, Hive does support some logical types like timestamp-millis, but not sure if decimal is supported. https://issues.apache.org/jira/browse/HIVE-8131 https://cwiki.apache.org/confluence/display/Hive/AvroSerDe#AvroSerDe-AvrotoHivetypeconversion If treating the

Re: Errors when attempting to use timestamp-millis fields with QueryRecord

2019-03-07 Thread Koji Kawamura
Hello, I believe this is a known issue. Unfortunately, querying against timestamp column is not supported. https://issues.apache.org/jira/browse/NIFI-5888 I'm working on fixing this at Calcite project, the sql execution engine underneath QueryRecord.

Re: Different NiFi Node sizes within same cluster

2019-03-07 Thread Koji Kawamura
> The last thing I'm looking to understand is what Byran B brought up, do load > balanced connections take into consideration the load of each node? No, load balanced connection doesn't use load of each node to calculate destination currently. As future improvement ideas. We can implement

Re: jolt transform spec ?

2019-03-06 Thread Koji Kawamura
Hello, I haven't tested myself, but using EvaluateJsonPath and ReplaceText with 'unescapeJson' EL function may be an alternative approach instead of Jolt. https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#unescapejson Idea is, use EvaluateJsonPath to extract the MSG part

Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

2019-02-15 Thread Koji Kawamura
Hi Pat, Thanks for sharing your insights. I will try benchmarking before and after "gzip.setExcludedPath()" that Mark has suggested if it helps improving S2S HTTP throughput. Koji On Fri, Feb 15, 2019 at 9:31 AM Pat White wrote: > > Hi Andy, > > My requirement is to use https with minimum tls

Re: problem with merging of failed response

2019-01-14 Thread Koji Kawamura
Hello, If you're using InvokeHttp processor to call REST endpoint, enabling "Always Output Response" might be helpful to generate a FlowFile even if the result HTTP status code is not successful. Thanks, Koji On Fri, Jan 11, 2019 at 2:31 PM l vic wrote: > > I have to merge results of

Re: Wait/Notify inconsistent behavior

2019-01-08 Thread Koji Kawamura
of both, and all this in a finite > loop determined by the answers, Gathering all the answers in one final Json. > > > Thank you very much. > > LC > > > > - Mensaje original - > De: "Koji Kawamura" > Para: "users" > Enviados: Lunes, 7 de Enero

Re: Wait/Notify inconsistent behavior

2019-01-07 Thread Koji Kawamura
stuck in the wait/notify issue, thanks for the sample I'll look > into it. Then I will see how to get the loop. > > Thanks a lot, > > Regards, > > LC > > > - Mensaje original - > De: "Koji Kawamura" > Para: "users" > Enviados: Dom

Re: Wait/Notify inconsistent behavior

2019-01-06 Thread Koji Kawamura
Hi Luis, Just a quick question, how are the "Signal Counter Name" and "Target Signal Count" properties for the Wait processor configured? If you'd like to wait for the two sub-flow branches to complete, then: "Signal Counter Name" should be blank, meaning check total count for all counter names

Re: SimpleCsvFileLookupService with LookupAttribute

2018-12-17 Thread Koji Kawamura
Hi Ryan, With following settings: # LookupAttribute (+dynamic) lookedUp=${someAttribute} # SimpleCsvFileLookupService CSV File=data.csv Lookup Key Column=id Lookup Value Column=value # data.csv id,value,desc one,1,the first number two,2,the 2nd number If a FlowFile having 'someAttribute'

Re: SQL Result to Attributes

2018-11-21 Thread Koji Kawamura
Hi Nick, I thought there was a discussion about adding a general database lookup service, but I can't find LookupService impl available that can fetch value from external databases. If there's such controller service, you can use LookupAttribute processor. (wondering if you're interested in

Re: Syntax to access freeSpace from REST API

2018-11-20 Thread Koji Kawamura
Hi Jim, The reason for 'contentRepositoryStorageUsage' being an array is that you can configure multiple content repository paths. In that case, each content repository can be mapped to different disk partitions and have own free space.
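Because 'contentRepositoryStorageUsage' is an array, a client has to iterate over it rather than read a single value. A minimal sketch of that, assuming a system-diagnostics style response in which each array entry carries 'identifier' and 'freeSpaceBytes' fields; the sample payload and the helper name are made up, so please check the field names against your own response.

```python
import json

# Made-up sample response; the nesting and field names are assumptions
# to be verified against a real system-diagnostics response.
sample = """
{"systemDiagnostics": {"aggregateSnapshot": {"contentRepositoryStorageUsage": [
  {"identifier": "default", "freeSpaceBytes": 1000},
  {"identifier": "repo2", "freeSpaceBytes": 2500}
]}}}
"""


def free_space_by_repository(payload: dict) -> dict:
    """Map each content repository identifier to its free space in bytes."""
    snapshot = payload["systemDiagnostics"]["aggregateSnapshot"]
    return {
        usage["identifier"]: usage["freeSpaceBytes"]
        for usage in snapshot["contentRepositoryStorageUsage"]
    }


free = free_space_by_repository(json.loads(sample))
print(free)
```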

Re: Multiple NiFi clusters with 1 NiFi Rigistry

2018-11-20 Thread Koji Kawamura
for NiFi Registry using ldap-user-group-provider. > > So in Ranger and NiFi Registry, the users are just listed as ‘bob’ and not > ‘bob@dev’. That means I would manually have to add users to Ranger and NiFi > Registry to add the ‘@dev’ part right? Or is there a way to customize that

Re: Multiple NiFi clusters with 1 NiFi Rigistry

2018-11-19 Thread Koji Kawamura
Hi Chad, NiFi Registry uses NiFi user's identity to authorize request. Registry also checks NiFi instance's identity to authorize proxying user requests, but this can only authorize proxy capability. In order to control access such as bucket read/write, Registry uses NiFi user's identity. I

Re: CaptureChangeMySQL - throwing Binlog connector communications failure

2018-11-19 Thread Koji Kawamura
t we don't do anything for that > event and right after that we get the ERROR saying could not find next log > > What are your views on this? > > On Mon, Nov 19, 2018 at 11:28 AM Koji Kawamura wrote: >> >> > I face this scenario in every 2-3 days. And surprisingly, th

Re: CaptureChangeMySQL - throwing Binlog connector communications failure

2018-11-18 Thread Koji Kawamura
hyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:793) > at java.lang.Thread.run(Thread.java:745) > > > I face this scenario in every 2-3 days. And surprisingly, the binlog file > never changed still received ROTATE events, do you know any reason why MySQL > se

Re: NiFi 1.7.1 remote group not connecting when added through restful api until nifi restarted

2018-11-15 Thread Koji Kawamura
message when I do a restart and there are existing remote > process groups but they do seem to connect eventually > > It's only the initial create of the first remote process group that seems to > be acting weird. > > -----Original Message- > From: Koji Kawamura > Se

Re: Edit QueryCassandra processor using REST API.

2018-11-14 Thread Koji Kawamura
Hi Dnyaneshwar, You can terminate remaining thread forcefully by sending a DELETE request to /processors/{id}/threads. https://nifi.apache.org/docs/nifi-docs/rest-api/index.html Thanks, Koji On Thu, Nov 15, 2018 at 4:14 PM Dnyaneshwar Pawar wrote: > > Hi > > We are trying to edit the
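For reference, the DELETE request could be prepared along these lines. This is a sketch only: the base URL and processor id are placeholders, the helper name is hypothetical, and a secured NiFi would additionally need authentication, which is not shown. The request is built but not sent.

```python
from urllib.request import Request


def build_terminate_threads_request(base_url: str, processor_id: str) -> Request:
    """Build (but do not send) a DELETE request for /processors/{id}/threads."""
    url = f"{base_url.rstrip('/')}/processors/{processor_id}/threads"
    return Request(url, method="DELETE")


# Placeholder values; replace with your own NiFi API base URL and processor id.
req = build_terminate_threads_request(
    "http://localhost:8080/nifi-api/", "hypothetical-processor-id"
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it.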

Re: NiFi 1.7.1 remote group not connecting when added through restful api until nifi restarted

2018-11-14 Thread Koji Kawamura
Hello William, > fails to connect to the existing input port until I do a restart of NiFi Is there any error message when it fails? Connection refused? It should not require a NiFi restart to establish connection. Thanks, Koji On Thu, Nov 15, 2018 at 1:38 AM William Gosse wrote: > > I'm

Re: CaptureChangeMySQL - throwing Binlog connector communications failure

2018-11-14 Thread Koji Kawamura
from any binlog position? > > > On Tue, Nov 13, 2018 at 7:27 AM Koji Kawamura wrote: >> >> Hi Anand, >> >> I'm not sure what caused the error, but I believe the error is MySQL error >> 1236. >> The error can happen with different reasons, you may find t

Re: Parsing a template to identify processor names

2018-11-13 Thread Koji Kawamura
Hello, I'm not sure if this is what you're looking for, but I wrote a test case before that loads template and use flow information to automate tests. The code can be a reference.

Re: CaptureChangeMySQL - throwing Binlog connector communications failure

2018-11-12 Thread Koji Kawamura
24061. > > at > com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:882) > > at > com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:559) > > at > com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:793

Re: Problem of connection in Remote Process Group

2018-11-08 Thread Koji Kawamura
Hi Jean, If you haven't, please take a look at this documentation. There are a few example configurations and deployment diagrams you can refer to. https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#site_to_site_reverse_proxy_properties Also, here are some Nginx configurations that

Re: Nulls in input data throwing exceptions when using QueryRecord

2018-11-08 Thread Koji Kawamura
Hi Mandeep, Thanks for reporting the issue and detailed explanation. That's very helpful! I was able to reproduce the issue and found a possible solution. Filed a JIRA, a PR will be submitted shortly to fix it. https://issues.apache.org/jira/browse/NIFI-5802 Thanks, Koji On Wed, Nov 7, 2018 at

Re: Can I define variables using other variables (or expression language)?

2018-11-06 Thread Koji Kawamura
Hi Krzysztof, Currently, NiFi variables cannot refer to other variables. But you may be able to achieve the expected result if you combine multiple variables within a single Expression Language. For example, if the "scratch_path" is the one you need to use at ListFile processor's "Input directory",

Re: Available SQL fn in QueryRecord??

2018-10-28 Thread Koji Kawamura
Hi Dano, Since QueryRecord uses Apache Calcite, it should support SQL functions supported by Calcite. Please check Calcite documentation. https://calcite.apache.org/docs/reference.html#arithmetic-operators-and-functions Thanks, Koji On Sat, Oct 27, 2018 at 5:45 AM dan young wrote: > > Hello, >

Re: how to merge attributes?

2018-10-23 Thread Koji Kawamura
Hello, InvokeHttp creates a new FlowFile for the "Result" relationship from the incoming FlowFile. That means the FlowFile for "Result" carries a copy of all attributes the incoming one has. You just need to connect the "Result" relationship to ReplaceText. "Original" can be auto-terminated in

Re: CaptureChangeMySQL - throwing Binlog connector communications failure

2018-10-23 Thread Koji Kawamura
Hello, Thanks for reporting the issue and the detailed analysis. I was able to reproduce it by setting a short wait_timeout of 30 sec. I'm not aware of any workaround to keep CaptureChangeMySQL alive at the moment. But you can write a script to stop/start CaptureChangeMySQL processor using NiFi

Re: Issue with PutSQL

2018-10-22 Thread Koji Kawamura
Hi Vyshali, I was able to connect MySQL Server 8.0.12 with mysql-connector-java-8.0.12.jar successfully, using username/password. Here are the SQLs I used to create a MySQL user: CREATE USER 'nifi'@'%' IDENTIFIED BY 'password'; GRANT ALL PRIVILEGES ON * . * TO 'nifi'@'%'; Does your connection

Re: Cluster Peer Lists

2018-09-27 Thread Koji Kawamura
Hi Peter, Site-to-Site client refreshes remote peer list per 60 secs. https://github.com/apache/nifi/blob/master/nifi-commons/nifi-site-to-site-client/src/main/java/org/apache/nifi/remote/client/PeerSelector.java#L60 The address configured to setup a S2S client is used to get remote peer list

Re: Listing S3

2018-09-24 Thread Koji Kawamura
Hi Martijn, I'm not an expert on Jython, but if you already have a python script using boto3 working fine, then I'd suggest using ExecuteStreamCommand instead. For example: - you can design the python script to print out JSON formatted string about listed files - then connect the outputs to

Re: High volume data with ExecuteSQL processor

2018-09-24 Thread Koji Kawamura
Hello, Did you try setting 'Max Rows Per Flow File' at ExecuteSQL processor? If the OOM happened when NiFi writes all results into a single FlowFile, then the property can help breaking the result set into several FlowFiles to avoid that. Thanks, Koji On Fri, Sep 21, 2018 at 3:56 PM Dnyaneshwar
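The effect of capping rows per FlowFile can be pictured as plain batching. This is an illustrative sketch of the idea only, not ExecuteSQL's code; `split_rows` is a hypothetical helper and it requires a positive limit.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def split_rows(rows: Iterable, max_rows_per_file: int) -> Iterator[List]:
    """Yield consecutive batches holding at most max_rows_per_file rows."""
    if max_rows_per_file <= 0:
        raise ValueError("max_rows_per_file must be positive in this sketch")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, max_rows_per_file))
        if not batch:
            return
        yield batch


# Ten made-up rows split into batches of at most four rows each.
batches = list(split_rows(range(10), 4))
print(batches)
```

Because each batch is materialized separately, memory use is bounded by the batch size rather than by the whole result set.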

Re: Unable to see Nifi data lineage in Atlas

2018-07-29 Thread Koji Kawamura
Hi Mohit, From the log message, I assume that you are using an existing atlas-application.properties copied from somewhere (most likely from HDP environment) and PLAINTEXTSASL is used in it. PLAINTEXTSASL is not supported by the ReportLineageToAtlas. As a work-around, please set 'Create Atlas

Re: RPG S2S Error

2018-07-29 Thread Koji Kawamura
rt 8c77c1b0-0164-1000--052fa54c's destination is >>> full; penalizing peer) at the same time i see the closing connection error. >>> I don't see a way to resolve the back pressure as we get continue stream of >>> data from the kafka which is then inserted int

Re: How many threads does Jetty use?

2018-07-18 Thread Koji Kawamura
I'd take a NiFi thread dump and analyze it with a thread dump analysis tool. fastThread can be used to count the number of threads grouped by thread group. Pretty handy. http://fastthread.io/ Thanks, Koji On Thu, Jul 19, 2018 at 6:41 AM, Peter Wicks (pwicks) wrote: > I know the default thread

Re: RPG S2S Error

2018-07-06 Thread Koji Kawamura
wFileServerProtocol.getRequestType(SocketFlowFileServerProtocol.java:147) >> at >> org.apache.nifi.remote.SocketRemoteSiteListener$1$1.run(SocketRemoteSiteListener.java:253) >> at java.lang.Thread.run(Thread.java:745) >> >> >> I notice this error is rep

Re: RPG S2S Error

2018-07-05 Thread Koji Kawamura
Hello, 1. The error message sounds like the client disconnects in the middle of Site-to-Site communication. Enabling debug log would show more information, by adding at conf/logback.xml. 2. I'd suggest checking if your 4 nodes receive data evenly (well distributed). Connection status history,

Re: Issue with http-notification service

2018-06-24 Thread Koji Kawamura
Hi Raman, Since you're using 'https' endpoint, I believe you need to configure HttpNotificationService with trust store settings. The NullPointerException can happen if OkHttpClient is null when the notification service tries to send notifications, that can happen if OkHttpClient is not

Re: Only get file when a set exists.

2018-06-06 Thread Koji Kawamura
2018 at 4:50 PM, Koji Kawamura wrote: > Hi Martijn, > > Thanks for sharing new information. > Here are couple of things to help debugging. > > # Debug Notify branch > 1. Stop Wait branch, to debug solely Notify branch function. Wait > processor deletes cache entry whe

Re: Only get file when a set exists.

2018-06-06 Thread Koji Kawamura
I can share my flowfile, but would have to email it to you directly, >> unfortunately I cannot share the flowfile publicly, and sanitising it to the >> extent that I can publicly share it would be difficult. >> >> Oh, we are using 1.6 >> >> Many thanks, >> >

Re: Only get file when a set exists.

2018-05-31 Thread Koji Kawamura
BTW, which version are you using? I hope it is 1.4.0 or higher. There was an issue having effects to your usage. https://issues.apache.org/jira/browse/NIFI-4028 On Thu, May 31, 2018 at 4:51 PM, Koji Kawamura wrote: > HI Martijn, > > I used the filename increment pattern based on y

Re: Only get file when a set exists.

2018-05-31 Thread Koji Kawamura
ount of sets. > > This configuration did exactly what we want, but unfortunately we had random > flowfiles stuck in the waitqueue for no apparent reason. > > Thanks, > > Martijn > > > > On 31 May 2018 at 05:23, Koji Kawamura wrote: >> >> The order of arri

Re: Only get file when a set exists.

2018-05-30 Thread Koji Kawamura
fy!!) > > Martijn > > On 31 May 2018 at 03:49, Koji Kawamura wrote: >> >> Glad to hear that was helpful. >> >> "4 same type for each extension", can be treated as "8 distinct types" >> if an extension is included in a type. >&g

Re: Only get file when a set exists.

2018-05-30 Thread Koji Kawamura
gt; >> >>> Hi Koji, >> >>> >> >>> Thank you for responding. I had adjusted the run schedule to closely >> >>> mimic our environment. We are expecting about 1 file per second or so. >> >>> We are also seeing some random "orp

Re: Only get file when a set exists.

2018-05-30 Thread Koji Kawamura
at we are looking for a set of 8 files - 4 x "ext1" and 4 x >>> "ext2" both with the same pattern: .ext1 or ext2 >>> >>> We found that the best way to make this work was to add another >>> wait/notify pair, each processor coming after the ones

Re: Hive connection Pool error

2018-05-29 Thread Koji Kawamura
Hello, Although I have encountered various Kerberos related errors, I haven't encountered that one. I tried to reproduce the same error by changing Kerberos related configuration, but to no avail. I recommend enabling Kerberos debug option for further debugging. You can add the option at

Re: Only get file when a set exists.

2018-05-27 Thread Koji Kawamura
Hi Martin, Alternative approach is using Wait/Notify processors. I have developed similar flow using those before, and it will work with your case I believe. A NiFi flow template is available here. https://gist.github.com/ijokarumawak/06b3b071eeb4d10d8a27507981422edd Hope this helps, Koji On

Re: MonitorActivity to PutSlack

2018-05-08 Thread Koji Kawamura
Hi Nick, You may find ExtractText useful to extract string from FlowFile content into FlowFile attribute. E.g. MonitorActivity -> ExtractText -> PutSlack Thanks, Koji On Wed, May 9, 2018 at 9:03 AM, Nick Carenza wrote: > Hey all, > > I can't find a way to send

Re: Nifi Remote Process Group FlowFile Distribution among nodes

2018-05-08 Thread Koji Kawamura
Hi Mohit, NiFi RPG batches multiple FlowFiles into the same Site-to-Site transaction, and the default batch settings are configured for higher throughput. If you prefer more granular distribution, you can lower the batch configurations from "Manage Remote Ports" context menu of a

Re: problem with Nifi / Atlas integration - has anyone some experience with this integration ?

2018-04-26 Thread Koji Kawamura
Hi Dominique, Thank you for your interest in NiFi and Atlas integration. I have some experience with that, and actually wrote the NiFi reporting task. I have two things in mind that could be related to your situation. One is NIFI-4971, which is under review now. It fixes lineage reporting issue

Re: Execute multiple HQL statements in PutHiveQL or SelectHiveQL

2018-04-19 Thread Koji Kawamura
Hello Dejan, I tested SET property statements bundled with INSERT statement in a single FlowFile passed to PutHiveQL. The same warning message is logged as you reported. However, actual INSERT was successful, I confirmed new rows were inserted. Please let us know if not the case. Although the

Re: MergeRecord

2018-04-13 Thread Koji Kawamura
oc comment when inferring schema unfortunately. Thanks, Koji On Fri, Apr 13, 2018 at 4:11 PM, Koji Kawamura <ijokaruma...@gmail.com> wrote: > Hi, > > I've tested InferAvroSchema and MergeRecord scenario. > As you described, records are not merged as expected. > > The reaso

Re: MergeRecord

2018-04-13 Thread Koji Kawamura
> Thanks for the answer. > > The 20k is just the last test, I’ve tested with 100,1000, with an input queue > of 10k, and it doesn’t change anything. > > I will try to simplify the test case and to not use the inferred schema. > > Regards > >> Le 13 avr. 2018 à 04:50,

Re: Fine-grained control over when a NiFi processor can run

2018-04-13 Thread Koji Kawamura
Hello Tim, If I understand your requirement correctly, I'd first try using the following flow with the existing processors: FetchDistributedMapCache -> RouteOnAttribute -- permit --> InvokeHTTP -- unmatched --> WhateverAlternativeRoute Assuming input FlowFiles have a timestamp denoting

Re: MergeRecord

2018-04-12 Thread Koji Kawamura
Hello, I checked your template. I haven't run the flow since I don't have sample input XML files. However, when I looked at the MergeRecord processor configuration, I found that: Minimum Number of Records = 2 Max Bin Age = 10 sec From briefly looking at the MergeRecord source code, it expires a bin

Re: Multi Domains Nifi connection and UI acces

2018-04-12 Thread Koji Kawamura
Hello, NiFi 1.6.0 has been released and it adds new nifi.property to whitelist multiple hosts so that NiFi can be accessed by different hostnames. Please see NIFI-4761 for details. I recommend updating to 1.6.0. https://issues.apache.org/jira/browse/NIFI-4761 Thanks, Koji On Wed, Apr 11, 2018

Re: PutHiveQL (1.5.0) throws un-necessary NullPointerException when parsing the query

2018-03-12 Thread Koji Kawamura
(dropped dev) Hi Amit, Thanks for sharing that. The parsing query is added by NIFI-4545 in NiFi 1.5.0, to parse Hive query at Hive related processors so that it can extract table names into outgoing FlowFile attributes. Those attributes are used by ReportLineageToAtlas reporting task to report

Re: Create nested records

2018-02-13 Thread Koji Kawamura
Hi Charlie, Thanks for sharing the template. The following configurations for UpdateRecord did the flat to nested mapping: - Replacement Value Strategy: Record Path Value - Dynamic property: "/phone" = "/" It maps the flat record into /phone child record. Fields that are not included in the

Re: Secure NiFi 1.5 Behind NGINX/HAProxy

2018-02-07 Thread Koji Kawamura
Hi Ryan, Although I am not sure why you'd want to use http between the clients and Nginx, I was able to set up a similar environment. I used LDAP provider instead of OpenID, but OpenID should work as well. The key is to NOT provide any client certificate from clients (browser/API) and Nginx to NiFi, so

Re: Is it possible to join multiple columns to a record using single lookup service

2018-01-30 Thread Koji Kawamura
Hi Sangavi, Good question, I thought it can be a nice example to illustrate how to use LookupService. I wrote a simple Gist entry with a NiFi template file to do what you are looking for. "Join, Enrich multiple columns by looking up an external CSV file"

Re: adding a filename column to a csv to insert into a table

2018-01-30 Thread Koji Kawamura
Hi Austin, I think there are a couple of ways to do that: 1. UpdateRecord with CSVReader and CSVWriter, update a column with a Record Path and Expression Language, e.g. Add a dynamic property, key=/filename, value=${filename} 2. Use SplitText to split each CSV record into a FlowFile, then combine
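The result of the first option (a 'filename' column populated from the FlowFile's filename attribute) can be pictured with a small standalone sketch. It only mimics the outcome outside NiFi; the helper name and the sample CSV are made up.

```python
import csv
import io


def add_filename_column(csv_text: str, filename: str) -> str:
    """Return the CSV with a 'filename' column appended to every record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or []) + ["filename"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in reader:
        # Same value on every row, like the dynamic property /filename.
        record["filename"] = filename
        writer.writerow(record)
    return out.getvalue()


# Made-up input content and filename.
result = add_filename_column("id,value\n1,a\n2,b\n", "data.csv")
print(result)
```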

Re: all of our schedule tasks not running/being scheduled....

2018-01-29 Thread Koji Kawamura
Hi Dan, If all available Timer Driven Threads are in use (or hung unexpectedly for some reason), then no processor can be scheduled. The number at the top left of the NiFi UI, under the NiFi logo, shows the number of threads currently working. If you see something more than 0, then I'd recommend to

Re: Updating schedule information using REST API without disturbing other processor configuration

2018-01-28 Thread Koji Kawamura
Hi Ravi, What does the request JSON you send to the PUT /processors/{id} endpoint look like? If you don't need to update any processor properties, then you don't have to send the /component/config/properties element in the request JSON. You can debug how NiFi UI sends REST requests using web browser
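As a rough illustration of a request body that changes only the scheduling settings and leaves out the properties element, here is a minimal Python sketch. The helper name and the sample id, revision, and period values are hypothetical; the field names (revision.version, component.id, component.config.schedulingStrategy, component.config.schedulingPeriod) reflect my understanding of the NiFi REST API and should be checked against the API documentation for your NiFi version. The revision version would normally be read from a prior GET of the same processor.

```python
import json


def build_schedule_update(processor_id: str, revision_version: int,
                          scheduling_period: str,
                          scheduling_strategy: str = "TIMER_DRIVEN") -> dict:
    """Build a body for PUT /processors/{id} that touches only scheduling.

    Hypothetical helper; note that no "properties" element is included.
    """
    return {
        "revision": {"version": revision_version},
        "component": {
            "id": processor_id,
            "config": {
                "schedulingStrategy": scheduling_strategy,
                "schedulingPeriod": scheduling_period,
            },
        },
    }


# Made-up processor id and revision, for illustration only.
payload = build_schedule_update("example-processor-id", 3, "30 sec")
print(json.dumps(payload, indent=2))
```

Sending the payload (for example with an HTTP client of your choice) is left out on purpose, since authentication and the base URL depend on the installation.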

Re: Maximum-value Columns on QueryDatabaseTable

2018-01-22 Thread Koji Kawamura
Hi Alberto, Thanks for reporting the issue. I was able to reproduce the behavior you described. Although it's for Microsoft SQL Server, there is an existing JIRA for the same issue, NIFI-4393. https://issues.apache.org/jira/browse/NIFI-4393 I've created a Pull Request to fix MS SQL square

Re: Get the failure reason from ValidateResult processor

2018-01-09 Thread Koji Kawamura
Hi Martin, I assume you wanted to ask about ValidateRecord. As you know, the ValidateRecord processor emits ROUTE provenance events with 'Details' that explain the validation error. E.g. "Records in this FlowFile were invalid for the following reasons: ; The following 1 fields were present in the

Re: ListS3 and FetchS3

2018-01-09 Thread Koji Kawamura
Hi Aruna, I guess the two resulting FlowFiles have the same content, namely the PDF file you specified in the FetchS3Object Object Key property. The flow actually worked as follows: 1. ListS3 listed two FlowFiles, Ntl_15.csv and 11500509.pdf 2. FetchS3Object is executed twice, once for each incoming FlowFile 2-1.

Re: Replay Event UUID

2018-01-09 Thread Koji Kawamura
en we > changed to the WriteAheadProvenanceRepository (like on the server where we > first noticed this) we did see it. Have there been changes in that > implementation? Could this cause what we are seeing? > > Thanks! > Rotem > > > On 28 Dec 2017 3:35 am, "Koji Kawamura" <ijokaruma...@gmail.c

Re: Bug/Unexpected behavior in ConvertJSONToSQL for boolean attributes

2018-01-02 Thread Koji Kawamura
uteSQL and ConvertRecord 1 -> true, 0 -> false) using following NiFi flow. https://gist.github.com/ijokarumawak/5b8d7dd5d799764dfd13dc6195025785 I hope this gets merged soon and becomes available in the next release. Thanks, Koji On Wed, Jan 3, 2018 at 8:13 AM, Koji Kawamura <ijokaruma...@g

Re: Bug/Unexpected behavior in ConvertJSONToSQL for boolean attributes

2018-01-02 Thread Koji Kawamura
Hi Jennifer, Thank you very much for reporting this. It seems the line that converts a Boolean to "0" or "1" in ConvertJSONToSQL is implemented incorrectly. It looks like a careless mistake. Sorry for the inconvenience.

Re: Replay Event UUID

2017-12-27 Thread Koji Kawamura
Hi Rotem, When I tested it with NiFi 1.5.0-SNAPSHOT, the REPLAY event has its FlowFile UUID as the parent (original) FlowFile UUID as expected. Type REPLAY FlowFile Uuid 8c61fdd7-c084-4756-946c-f5669dc4442d File Size 4 bytes Component Id 9abc21e3-0160-1000-6d6f-a1c408f75b7a Component Name

Re: Clarification on load distribution on NiFi cluster

2017-12-21 Thread Koji Kawamura
Hi Ravi, To distribute the QueryDatabaseTable workload, I'd suggest using GenerateTableFetch instead, because it can generate SQL statements to query updated records, and those SQL FlowFiles can be distributed among NiFi nodes by an RPG. - The following lines are just to share my thoughts on the topic for

Re: GetSFTP error

2017-11-15 Thread Koji Kawamura
Hello, I haven't tried it myself, but from the stacktrace and the Jsch source code, I think you should specify a file in pkcs8 format instead of pkcs12. Jsch will leave keypair null if it fails to parse it, which may be the cause of the NullPointerException. For converting a pem file to a pkcs8,
