[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 Thanks @MikeThomsen @pvillard31 @mattyb149 @joewitt @jskora and everyone for your advice/support. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user pvillard31 commented on the issue: https://github.com/apache/nifi/pull/2101 Did some additional local tests. LGTM. +1, merging to master, thank you all. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @pvillard31 @mattyb149 @MikeThomsen I've added expression language support for username and password. Please let me know if there is any other recommendation. Thanks Mans ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mattyb149 commented on the issue: https://github.com/apache/nifi/pull/2101 The L&N LGTM :) the MIT license reference goes in the assembly LICENSE and the NAR license (which it is), and nothing in the NOTICE unless its transitive dependencies have NOTICEs, I verified only Apache Commons Lang does, and it is included. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user pvillard31 commented on the issue: https://github.com/apache/nifi/pull/2101 It looks OK to me, I'd just suggest allowing EL on username/password. Will probably be useful with the Registry stuff. I know sensitive properties are not handled by the registry work yet, but that could be in the future. Before giving a final +1, I'd appreciate another opinion on L&N aspects but it LGTM. Thanks for all your work @mans2singh ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @pvillard31 - You were right - After renaming the integration tests they don't get executed by default mvn install. Also removed test profile from pom.xml as recommended. Please let me know if you have any additional recommendation. Thanks again. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 @pvillard31 @alopresto Could one of you take a look? I think @mans2singh has done an excellent job addressing our concerns. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 Hi @MikeThomsen @mattyb149 @joewitt I believe I have implemented all the review changes. Please let me know if there is anything I have missed or you have any additional recommendations. Thanks for your feedback. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @MikeThomsen - If the fields, tags and timestamp for a measurement are the same, they are considered to be the same record. Regarding size limit - I did not find any mention of size limit for posting data in the influxdb docs. I think this all depends on use cases and with the size limit processor property available, the nifi admin configure the values based on their influx db cluster and load. Let me know if I have missed anything or anything else required. Thanks again for your advice/comments. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @mattyb149 @MikeThomsen Please let me know if you have any addiional comments/recommendations for this processor. Thanks ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @mattyb149 I've updated the code based on your review feedback (added MIT license, updated property description, added expression language support for db name and size properties, and added connection timeout property, updated documentation regarding line protocol). I've also added retry relation (for handling socket timeout), and max size exceed relation to handle if records size exceeds limit. Unit/integration tests are also updated for these changes. Regarding max size limit - this is limit on the size of records being posted to InfluxDB. The processor can push a batch of records separated by new line to InfluxDB. I wanted to give the admin some control over how much data is being posted to InfluxDB. If the size is exceeded, the size exceed relation can be used to handle this scenario. Please let me know if you have any additional recommendations. Thanks again for your thoughtful feedback. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 @mans2singh LGTM, but we need a committer to sign off now. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @MikeThomsen Thanks for the test file and docker-compose. Let me know if there is anything else required. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 Here's a simple test that generates a new write every 3 seconds. [PutInfluxDb_Test.xml.gz](https://github.com/apache/nifi/files/1671417/PutInfluxDb_Test.xml.gz) This is the query I am using in Chronograf: ``` SELECT mean("humidity") AS "mean_humidity" FROM "test_data"."autogen"."water" WHERE time > now() - 1h GROUP BY :interval: FILL(null) ``` And here is a docker-compose file you can use to set up the backend: ``` version: '2' services: influxdb: image: influxdb:latest #volumes: # - ./data/influxdb:/var/lib/influxdb ports: - "8086:8086" - "8083:8083" - "2003:2003" environment: INFLUXDB_GRAPHITE_ENABLED: "true" INFLUXDB_ADMIN_ENABLED: "true" # Define a Chronograf service chronograf: image: chronograf #volumes: # - ./data/chronograf:/var/lib/chronograf ports: - "1:1" - ":" links: - influxdb ``` Web UI on port for testing. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 +1 LGTM. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 Testing it out now. Will let you know if actually using it I observe anything weird. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @MikeThomsen @mattyb149 I've changed the validator and updated the unit test cases for empty username and password. Please let me know if you have any other comments/recommendations. Thanks Mans ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 @mans2singh This works in the unit test per @mattyb149 's point: ``` runner.assertValid(); runner.setProperty(PutInfluxDB.PASSWORD, ""); runner.assertNotValid(); ``` And that's with `NON_EMPTY_VALIDATOR` being used. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mattyb149 commented on the issue: https://github.com/apache/nifi/pull/2101 By setting the NON_BLANK or NON_EMPTY validator, you are prohibiting someone from using an empty string as a username or password, which is what you want, right? So that unit test should fail because you are setting an empty string on USERNAME and then asserting that it is valid, when it is not. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 @mattyb149 That sounds like the validator is being called even when it's not a required property. Is there a good reason for NiFi to do that? ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @mattyb149 @MikeThomsen If I use `NON_EMPTY_VALIDATOR` - I get the validation exception. I am including the exception, property definition and unit test snippets below. Please let me know if I am missing anything. Thanks. Here is the unit test validation exception with `NON_EMPTY_VALIDATOR`: ` java.lang.AssertionError: Processor has 1 validation failures: 'influxdb-username' validated against '' is invalid because influxdb-username cannot be empty ` Here is how I am defining the property ``` public static final PropertyDescriptor USERNAME = new PropertyDescriptor.Builder() .name("influxdb-username") .displayName("Username") .required(false) .description("Username for accessing InfluxDB") .addValidator(StandardValidators.NON_EMPTY_VALIDATOR) .build(); ``` and here is the unit test: ``` @Test public void testEmptyUsername() { runner.setProperty(PutInfluxDB.USERNAME,""); runner.assertValid(); } ``` ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mattyb149 commented on the issue: https://github.com/apache/nifi/pull/2101 Probably NON_EMPTY_VALIDATOR? ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 @mattyb149 Do you have a recommendation for the validator [to use here]( https://github.com/apache/nifi/pull/2101#issuecomment-361025124)? ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 Hello @MikeThomsen Thanks for your feedback. Based on your feedback I've update the code (added braces, corrected logging, removed state variables not required from the put processor and added integration tests profile). Regarding the timestamp issue - I forgot to mention that InfluxDB requires timestamp in nano-seconds resolution so we need to multiply the timestamp with million (I've updated the documentation also). I've tested with the following template in generate processor: `water,country=US,city=sf rain=1,humidity=0.6 ${now():toNumber():multiply(100) }` and I see the following in the influx shell: ``` time city country humidity rain --- 2018-01-27T23:31:46.331Z sf US 0.6 1 ``` Please let me know if you have any additional advice/comments for me. Thanks again. Mans ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 @mans2singh Figured out the issue, it's a time conversion. I added ```:multiply(100)``` to the EL and it worked. InfluxDB uses nanoseconds; I was supplying milliseconds. @joewitt FWIW +1 LGTM on merging. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 Everything I insert gets defaulted to some time around the epoch. I tried your sample, but it doesn't seem to work for me at least (data gets written, but it doesn't have the appropriate timestamp): ``` water,country=US,city=sf rain=1,humidity=0.6 ${now():toNumber()} ``` Ex output: ``` water,country=US,city=sf rain=1,humidity=0.6 1516999446718 ``` Using this query in Chronograf, I get the following results: ``` SELECT * FROM "test_data"."autogen"."water" ``` https://user-images.githubusercontent.com/108184/35459708-4ccb9ba6-02af-11e8-852f-7aeabd1806dc.png";> So overall LGTM on code quality, but we need to figure out if this is just something wrong on my end or with the processor before someone merges it. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @MikeThomsen - Thanks for your feedback. Please let me know if I can assist in anyway. Mans ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 I'm going to try to get some time today and tomorrow to go through the code itself, but I was able to build a flow that uses the line protocol as described, and it successfully inserted multiple entries into InfluxDB. So... so far, so good. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @MikeThomsen - Thanks for your review comments. I've updated the version and rebased the code. Regarding the error - InfluxDB uses line protocol for writing data (I've mentioned that in the capabilities section of the processor) - [https://docs.influxdata.com/influxdb/v1.4/write_protocols/line_protocol_tutorial/] to for writing data. I have integration/unit test that use the protocol for good and bad format scenarios - - integration test - [https://github.com/apache/nifi/pull/2101/files#diff-6aa4a2accea7ae7b489e2a219fb777f2] - unit tests - [https://github.com/apache/nifi/pull/2101/files#diff-9f4273a51b147a9e22085ef842726664] If you are using the flow file generator with the line protocol - the timestamp (last field) is a number - so please use `${now():toNumber()}` rather than `${now():toString()}` as shown in the sample measurement below: `water,country=US,city=sf rain=1,humidity=0.6 ${now():toNumber()}` Please let me know of you have any additional comments/recommendations or if there is anything else required. Thanks ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2101 @mans2singh Do you have an example of valid JSON input I can use to test this? I grabbed an example from here: https://github.com/influxdata/influxdb-python That's the Python client. I'm using this in a GenerateFlowFile processor: `` { "measurement": "cpu_load_short", "tags": { "host": "server01", "region": "us-west" }, "time": "${now():toString()}", "fields": { "value": 0.64 } } `` It doesn't like that. Throws this exception: ``` org.influxdb.InfluxDBException: {"error":"unable to parse '{': missing fields\nunable to parse '\"measurement\": \"cpu_load_short\",': invalid field format\nunable to parse '\"tags\": {': invalid field format\nunable to parse '\"host\": \"server01\",': invalid field format\nunable to parse '\"region\": \"us-west\"': invalid field format\nunable to parse '},': missing tag key\nunable to parse '\"time\": \"Sun Jan 21 16:48:12 EST 2018\",': invalid field format\nunable to parse '\"fields\": {': invalid field format\nunable to parse '\"value\": 0.64': invalid field format\nunable to parse '}': missing fields\nunable to parse '}': missing fields"} at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:511) at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:325) at org.apache.nifi.processors.influxdb.PutInfluxDB.writeToInfluxDB(PutInfluxDB.java:169) at org.apache.nifi.processors.influxdb.PutInfluxDB.onTrigger(PutInfluxDB.java:149) at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1122) at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147) at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:128) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ``` If you can help me out with the test data, I can keep going with the review. ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 @joewitt I've removed the MIT License statements. Please let me know if there are any other comments/recommndations. Thanks ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 Hey Folks - Please let me know your feedback/suggestions for this time-series db processor. Thanks ---
[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor
Github user mans2singh commented on the issue: https://github.com/apache/nifi/pull/2101 Hey Nifi Team: Please let me know your feedback/comments on this processor. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---