Hi Mark

All the files in my test flow are 1 GB. But the issue also happens in my 
production flow, with different file sizes. 

When these issues occur, I route the FlowFile to a disabled UpdateAttribute 
processor, just to keep the file in a queue. If I then enable the processor and 
send the file back through a new hash calculation, the file is OK. So I don’t 
think the backup-and-compare test makes sense to do. 

Regards 
Jens

> On 3 Nov 2021, at 15:57, Mark Payne <[email protected]> wrote:
> 
> So what I found interesting about the histogram output was that in each case, 
> the input file was 1 GB. Between the ‘good’ and ‘bad’ hashes, something like 
> 500-700 bytes had differing values. But those values ranged significantly. 
> There was no indication of the type of thing we’ve seen with NFS mounts, where 
> data is nulled out until received and then updated. If that had been the case, 
> we’d have seen a very significant change in the histogram for the NUL byte (or 
> some other single value), but we didn’t see that.
> 
> So here are a couple more ideas that I think could be useful.
> 
> 1) Which garbage collector are you using? It’s configured in the 
> bootstrap.conf file
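> 
> For reference, the collector is selected by a java.arg line in
> conf/bootstrap.conf. A minimal sketch, assuming G1 is the collector in use
> (the numeric index after java.arg. is illustrative and varies between
> installs):

```properties
# conf/bootstrap.conf -- illustrative fragment; the index after "java.arg." varies
java.arg.13=-XX:+UseG1GC
```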
> 
> 2) We can try to definitively prove out whether the content on the disk is 
> changing or if there’s an issue reading the content. To do this:
> 
> 1. Stop all processors.
> 2. Shut down NiFi.
> 3. rm -rf content_repository; rm -rf flowfile_repository   (warning, this 
> will delete all FlowFiles & content, so only do this on a dev/test system 
> where you’re comfortable deleting it!)
> 4. Start NiFi.
> 5. Let exactly 1 FlowFile into your flow.
> 6. While it is looping through, create a copy of your entire Content 
> Repository: cp -r content_repository content_backup1; zip content_backup1.zip 
> content_backup1
> 7. Wait for the hashes to differ
> 8. Create another copy of the Content Repository: cp -r content_repository 
> content_backup2
> 9. Find the files within content_backup1 and content_backup2 and compare 
> them to see if they are identical. I would recommend comparing them using each 
> of the 3 methods: sha256, sha512, and diff.
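> 
> The comparison in step 9 can be sketched in Python (a sketch with
> hypothetical helper names, standing in for running sha256sum, sha512sum,
> and diff by hand on each pair of content-claim files):

```python
import hashlib
from pathlib import Path


def file_digests(path, chunk_size=1 << 20):
    """Stream a file once, computing SHA-256 and SHA-512 together."""
    sha256, sha512 = hashlib.sha256(), hashlib.sha512()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            sha256.update(chunk)
            sha512.update(chunk)
    return sha256.hexdigest(), sha512.hexdigest()


def claims_identical(path_a, path_b):
    """Compare two copies of a content claim using all 3 methods:
    sha256, sha512, and a raw byte comparison (the 'diff' check)."""
    return (file_digests(path_a) == file_digests(path_b)
            and Path(path_a).read_bytes() == Path(path_b).read_bytes())
```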
> 
> This should make it pretty clear that either:
> (1) the issue resides in the software: either NiFi or the JVM
> (2) the issue resides outside of the software: the disk, the disk driver, the 
> operating system, the VM hypervisor, etc.
> 
> Thanks
> -Mark
> 
>> On Nov 3, 2021, at 10:44 AM, Joe Witt <[email protected]> wrote:
>> 
>> Jens,
>> 
>> 184 hours (7.6 days) in and zero issues.
>> 
>> Will need to turn this off soon but wanted to give a final update.
>> Looks great.  Given the information on your system, there appears to be
>> something we don't understand related to the virtual file system
>> involved.
>> 
>> Thanks
>> 
>>> On Tue, Nov 2, 2021 at 10:55 PM Jens M. Kofoed <[email protected]> 
>>> wrote:
>>> 
>>> Hi Mark
>>> 
>>> Of course, sorry :-)  Looking at the error messages, I can see that only 
>>> the histogram entries with differences are listed. And all 3 files have 
>>> their first issue at histogram.9. I don't know what that means.
>>> 
>>> /Jens
>>> 
>>> Here is the error log:
>>> 2021-11-01 23:57:21,955 ERROR [Timer-Driven Process Thread-10] 
>>> org.apache.nifi.processors.script.ExecuteScript 
>>> ExecuteScript[id=c7d3335b-1045-14ed-ffff-ffffa0d62c70] There are 
>>> differences in the histogram
>>> Byte Value: histogram.10, Previous Count: 11926720, New Count: 11926721
>>> Byte Value: histogram.100, Previous Count: 11927504, New Count: 11927503
>>> Byte Value: histogram.101, Previous Count: 11925396, New Count: 11925407
>>> Byte Value: histogram.102, Previous Count: 11929923, New Count: 11929941
>>> Byte Value: histogram.103, Previous Count: 11931596, New Count: 11931591
>>> Byte Value: histogram.104, Previous Count: 11929071, New Count: 11929064
>>> Byte Value: histogram.105, Previous Count: 11931365, New Count: 11931348
>>> Byte Value: histogram.106, Previous Count: 11928661, New Count: 11928645
>>> Byte Value: histogram.107, Previous Count: 11929864, New Count: 11929866
>>> Byte Value: histogram.108, Previous Count: 11931611, New Count: 11931642
>>> Byte Value: histogram.109, Previous Count: 11932758, New Count: 11932763
>>> Byte Value: histogram.110, Previous Count: 11927893, New Count: 11927895
>>> Byte Value: histogram.111, Previous Count: 11933519, New Count: 11933522
>>> Byte Value: histogram.112, Previous Count: 11931392, New Count: 11931397
>>> Byte Value: histogram.113, Previous Count: 11928534, New Count: 11928548
>>> Byte Value: histogram.114, Previous Count: 11936879, New Count: 11936874
>>> Byte Value: histogram.115, Previous Count: 11932818, New Count: 11932804
>>> Byte Value: histogram.117, Previous Count: 11929143, New Count: 11929151
>>> Byte Value: histogram.118, Previous Count: 11931854, New Count: 11931829
>>> Byte Value: histogram.119, Previous Count: 11926333, New Count: 11926327
>>> Byte Value: histogram.120, Previous Count: 11928731, New Count: 11928740
>>> Byte Value: histogram.121, Previous Count: 11931149, New Count: 11931162
>>> Byte Value: histogram.122, Previous Count: 11926725, New Count: 11926733
>>> Byte Value: histogram.32, Previous Count: 11930422, New Count: 11930425
>>> Byte Value: histogram.33, Previous Count: 11934311, New Count: 11934313
>>> Byte Value: histogram.34, Previous Count: 11930459, New Count: 11930446
>>> Byte Value: histogram.35, Previous Count: 11924776, New Count: 11924758
>>> Byte Value: histogram.36, Previous Count: 11924186, New Count: 11924183
>>> Byte Value: histogram.37, Previous Count: 11928616, New Count: 11928627
>>> Byte Value: histogram.38, Previous Count: 11929474, New Count: 11929490
>>> Byte Value: histogram.39, Previous Count: 11929607, New Count: 11929600
>>> Byte Value: histogram.40, Previous Count: 11928053, New Count: 11928048
>>> Byte Value: histogram.41, Previous Count: 11930402, New Count: 11930399
>>> Byte Value: histogram.42, Previous Count: 11926830, New Count: 11926846
>>> Byte Value: histogram.44, Previous Count: 11932536, New Count: 11932538
>>> Byte Value: histogram.45, Previous Count: 11931053, New Count: 11931044
>>> Byte Value: histogram.46, Previous Count: 11930008, New Count: 11930011
>>> Byte Value: histogram.47, Previous Count: 11927747, New Count: 11927734
>>> Byte Value: histogram.48, Previous Count: 11936055, New Count: 11936057
>>> Byte Value: histogram.49, Previous Count: 11931471, New Count: 11931474
>>> Byte Value: histogram.50, Previous Count: 11931921, New Count: 11931908
>>> Byte Value: histogram.51, Previous Count: 11929643, New Count: 11929637
>>> Byte Value: histogram.52, Previous Count: 11923847, New Count: 11923854
>>> Byte Value: histogram.53, Previous Count: 11927311, New Count: 11927303
>>> Byte Value: histogram.54, Previous Count: 11933754, New Count: 11933766
>>> Byte Value: histogram.55, Previous Count: 11925964, New Count: 11925970
>>> Byte Value: histogram.56, Previous Count: 11928872, New Count: 11928873
>>> Byte Value: histogram.57, Previous Count: 11931124, New Count: 11931127
>>> Byte Value: histogram.58, Previous Count: 11928474, New Count: 11928477
>>> Byte Value: histogram.59, Previous Count: 11925814, New Count: 11925812
>>> Byte Value: histogram.60, Previous Count: 11933978, New Count: 11933991
>>> Byte Value: histogram.61, Previous Count: 11934136, New Count: 11934123
>>> Byte Value: histogram.62, Previous Count: 11932016, New Count: 11932011
>>> Byte Value: histogram.63, Previous Count: 23864588, New Count: 23864584
>>> Byte Value: histogram.64, Previous Count: 11924792, New Count: 11924789
>>> Byte Value: histogram.65, Previous Count: 11934789, New Count: 11934797
>>> Byte Value: histogram.66, Previous Count: 11933047, New Count: 11933044
>>> Byte Value: histogram.67, Previous Count: 11931899, New Count: 11931909
>>> Byte Value: histogram.68, Previous Count: 11935615, New Count: 11935609
>>> Byte Value: histogram.69, Previous Count: 11927249, New Count: 11927239
>>> Byte Value: histogram.70, Previous Count: 11933276, New Count: 11933274
>>> Byte Value: histogram.71, Previous Count: 11927953, New Count: 11927969
>>> Byte Value: histogram.72, Previous Count: 11929275, New Count: 11929266
>>> Byte Value: histogram.73, Previous Count: 11930292, New Count: 11930306
>>> Byte Value: histogram.74, Previous Count: 11935428, New Count: 11935427
>>> Byte Value: histogram.75, Previous Count: 11930317, New Count: 11930307
>>> Byte Value: histogram.76, Previous Count: 11935737, New Count: 11935726
>>> Byte Value: histogram.77, Previous Count: 11932127, New Count: 11932125
>>> Byte Value: histogram.78, Previous Count: 11932344, New Count: 11932349
>>> Byte Value: histogram.79, Previous Count: 11932094, New Count: 11932100
>>> Byte Value: histogram.80, Previous Count: 11930688, New Count: 11930687
>>> Byte Value: histogram.81, Previous Count: 11928415, New Count: 11928416
>>> Byte Value: histogram.82, Previous Count: 11931559, New Count: 11931542
>>> Byte Value: histogram.83, Previous Count: 11934192, New Count: 11934176
>>> Byte Value: histogram.84, Previous Count: 11927224, New Count: 11927231
>>> Byte Value: histogram.85, Previous Count: 11929491, New Count: 11929484
>>> Byte Value: histogram.87, Previous Count: 11932201, New Count: 11932190
>>> Byte Value: histogram.88, Previous Count: 11930694, New Count: 11930680
>>> Byte Value: histogram.89, Previous Count: 11936439, New Count: 11936448
>>> Byte Value: histogram.9, Previous Count: 11933187, New Count: 11933193
>>> Byte Value: histogram.90, Previous Count: 11926445, New Count: 11926455
>>> Byte Value: histogram.94, Previous Count: 11931596, New Count: 11931609
>>> Byte Value: histogram.95, Previous Count: 11929379, New Count: 11929384
>>> Byte Value: histogram.97, Previous Count: 11928864, New Count: 11928874
>>> Byte Value: histogram.98, Previous Count: 11924738, New Count: 11924729
>>> Byte Value: histogram.99, Previous Count: 11930062, New Count: 11930059
>>> 
>>> 2021-11-01 22:10:02,765 ERROR [Timer-Driven Process Thread-9] 
>>> org.apache.nifi.processors.script.ExecuteScript 
>>> ExecuteScript[id=c7d3335b-1045-14ed-ffff-ffffa0d62c70] There are 
>>> differences in the histogram
>>> Byte Value: histogram.10, Previous Count: 11932402, New Count: 11932407
>>> Byte Value: histogram.100, Previous Count: 11927531, New Count: 11927541
>>> Byte Value: histogram.101, Previous Count: 11928454, New Count: 11928430
>>> Byte Value: histogram.102, Previous Count: 11934432, New Count: 11934439
>>> Byte Value: histogram.103, Previous Count: 11924623, New Count: 11924633
>>> Byte Value: histogram.104, Previous Count: 11934492, New Count: 11934474
>>> Byte Value: histogram.105, Previous Count: 11934585, New Count: 11934591
>>> Byte Value: histogram.106, Previous Count: 11928955, New Count: 11928948
>>> Byte Value: histogram.108, Previous Count: 11930139, New Count: 11930140
>>> Byte Value: histogram.109, Previous Count: 11929325, New Count: 11929321
>>> Byte Value: histogram.110, Previous Count: 11930486, New Count: 11930478
>>> Byte Value: histogram.111, Previous Count: 11933517, New Count: 11933508
>>> Byte Value: histogram.112, Previous Count: 11928334, New Count: 11928339
>>> Byte Value: histogram.114, Previous Count: 11929222, New Count: 11929213
>>> Byte Value: histogram.116, Previous Count: 11931182, New Count: 11931188
>>> Byte Value: histogram.117, Previous Count: 11933407, New Count: 11933402
>>> Byte Value: histogram.118, Previous Count: 11932709, New Count: 11932705
>>> Byte Value: histogram.120, Previous Count: 11933700, New Count: 11933708
>>> Byte Value: histogram.121, Previous Count: 11929803, New Count: 11929801
>>> Byte Value: histogram.122, Previous Count: 11930218, New Count: 11930220
>>> Byte Value: histogram.32, Previous Count: 11924458, New Count: 11924469
>>> Byte Value: histogram.33, Previous Count: 11934243, New Count: 11934248
>>> Byte Value: histogram.34, Previous Count: 11930696, New Count: 11930700
>>> Byte Value: histogram.35, Previous Count: 11925574, New Count: 11925577
>>> Byte Value: histogram.36, Previous Count: 11929198, New Count: 11929187
>>> Byte Value: histogram.37, Previous Count: 11928146, New Count: 11928143
>>> Byte Value: histogram.38, Previous Count: 11932505, New Count: 11932510
>>> Byte Value: histogram.39, Previous Count: 11929406, New Count: 11929412
>>> Byte Value: histogram.40, Previous Count: 11930100, New Count: 11930098
>>> Byte Value: histogram.41, Previous Count: 11930867, New Count: 11930872
>>> Byte Value: histogram.42, Previous Count: 11930796, New Count: 11930793
>>> Byte Value: histogram.43, Previous Count: 11930796, New Count: 11930789
>>> Byte Value: histogram.44, Previous Count: 11921866, New Count: 11921865
>>> Byte Value: histogram.45, Previous Count: 11935682, New Count: 11935699
>>> Byte Value: histogram.46, Previous Count: 11930075, New Count: 11930073
>>> Byte Value: histogram.47, Previous Count: 11928169, New Count: 11928165
>>> Byte Value: histogram.48, Previous Count: 11933490, New Count: 11933478
>>> Byte Value: histogram.49, Previous Count: 11932174, New Count: 11932180
>>> Byte Value: histogram.50, Previous Count: 11933255, New Count: 11933239
>>> Byte Value: histogram.51, Previous Count: 11934009, New Count: 11934013
>>> Byte Value: histogram.52, Previous Count: 11928361, New Count: 11928367
>>> Byte Value: histogram.53, Previous Count: 11927626, New Count: 11927627
>>> Byte Value: histogram.54, Previous Count: 11931611, New Count: 11931617
>>> Byte Value: histogram.55, Previous Count: 11930755, New Count: 11930746
>>> Byte Value: histogram.56, Previous Count: 11933823, New Count: 11933824
>>> Byte Value: histogram.57, Previous Count: 11922508, New Count: 11922510
>>> Byte Value: histogram.58, Previous Count: 11930384, New Count: 11930362
>>> Byte Value: histogram.59, Previous Count: 11929805, New Count: 11929820
>>> Byte Value: histogram.60, Previous Count: 11930064, New Count: 11930055
>>> Byte Value: histogram.61, Previous Count: 11926761, New Count: 11926762
>>> Byte Value: histogram.62, Previous Count: 11927605, New Count: 11927604
>>> Byte Value: histogram.63, Previous Count: 23858926, New Count: 23858913
>>> Byte Value: histogram.64, Previous Count: 11929516, New Count: 11929512
>>> Byte Value: histogram.65, Previous Count: 11930217, New Count: 11930223
>>> Byte Value: histogram.66, Previous Count: 11930478, New Count: 11930481
>>> Byte Value: histogram.67, Previous Count: 11939855, New Count: 11939858
>>> Byte Value: histogram.68, Previous Count: 11927850, New Count: 11927852
>>> Byte Value: histogram.69, Previous Count: 11931154, New Count: 11931175
>>> Byte Value: histogram.70, Previous Count: 11935374, New Count: 11935369
>>> Byte Value: histogram.71, Previous Count: 11930754, New Count: 11930751
>>> Byte Value: histogram.72, Previous Count: 11928304, New Count: 11928318
>>> Byte Value: histogram.73, Previous Count: 11931772, New Count: 11931766
>>> Byte Value: histogram.74, Previous Count: 11939417, New Count: 11939426
>>> Byte Value: histogram.75, Previous Count: 11930712, New Count: 11930718
>>> Byte Value: histogram.76, Previous Count: 11933331, New Count: 11933346
>>> Byte Value: histogram.77, Previous Count: 11931279, New Count: 11931272
>>> Byte Value: histogram.78, Previous Count: 11928276, New Count: 11928290
>>> Byte Value: histogram.79, Previous Count: 11930071, New Count: 11930067
>>> Byte Value: histogram.80, Previous Count: 11927830, New Count: 11927825
>>> Byte Value: histogram.81, Previous Count: 11931213, New Count: 11931206
>>> Byte Value: histogram.82, Previous Count: 11930964, New Count: 11930958
>>> Byte Value: histogram.83, Previous Count: 11928973, New Count: 11928966
>>> Byte Value: histogram.84, Previous Count: 11934325, New Count: 11934331
>>> Byte Value: histogram.85, Previous Count: 11929658, New Count: 11929654
>>> Byte Value: histogram.86, Previous Count: 11924667, New Count: 11924666
>>> Byte Value: histogram.87, Previous Count: 11931100, New Count: 11931106
>>> Byte Value: histogram.88, Previous Count: 11930252, New Count: 11930248
>>> Byte Value: histogram.89, Previous Count: 11927281, New Count: 11927299
>>> Byte Value: histogram.9, Previous Count: 11932848, New Count: 11932851
>>> Byte Value: histogram.90, Previous Count: 11930398, New Count: 11930399
>>> Byte Value: histogram.94, Previous Count: 11928720, New Count: 11928715
>>> Byte Value: histogram.95, Previous Count: 11928988, New Count: 11928977
>>> Byte Value: histogram.97, Previous Count: 11931423, New Count: 11931426
>>> Byte Value: histogram.98, Previous Count: 11928181, New Count: 11928184
>>> Byte Value: histogram.99, Previous Count: 11935549, New Count: 11935542
>>> 
>>> 2021-11-01 22:23:08,989 ERROR [Timer-Driven Process Thread-10] 
>>> org.apache.nifi.processors.script.ExecuteScript 
>>> ExecuteScript[id=24d13930-49e8-1062-9a2c-943118738138] There are 
>>> differences in the histogram
>>> Byte Value: histogram.10, Previous Count: 11930417, New Count: 11930411
>>> Byte Value: histogram.100, Previous Count: 11926739, New Count: 11926755
>>> Byte Value: histogram.101, Previous Count: 11930580, New Count: 11930574
>>> Byte Value: histogram.102, Previous Count: 11928210, New Count: 11928202
>>> Byte Value: histogram.103, Previous Count: 11935300, New Count: 11935297
>>> Byte Value: histogram.104, Previous Count: 11925804, New Count: 11925820
>>> Byte Value: histogram.105, Previous Count: 11931023, New Count: 11931012
>>> Byte Value: histogram.106, Previous Count: 11932342, New Count: 11932344
>>> Byte Value: histogram.108, Previous Count: 11930098, New Count: 11930106
>>> Byte Value: histogram.109, Previous Count: 11930759, New Count: 11930750
>>> Byte Value: histogram.110, Previous Count: 11934343, New Count: 11934352
>>> Byte Value: histogram.111, Previous Count: 11935775, New Count: 11935782
>>> Byte Value: histogram.112, Previous Count: 11933877, New Count: 11933884
>>> Byte Value: histogram.113, Previous Count: 11926675, New Count: 11926674
>>> Byte Value: histogram.114, Previous Count: 11929332, New Count: 11929336
>>> Byte Value: histogram.115, Previous Count: 11928876, New Count: 11928878
>>> Byte Value: histogram.116, Previous Count: 11927819, New Count: 11927833
>>> Byte Value: histogram.117, Previous Count: 11932657, New Count: 11932638
>>> Byte Value: histogram.118, Previous Count: 11933508, New Count: 11933507
>>> Byte Value: histogram.119, Previous Count: 11928808, New Count: 11928821
>>> Byte Value: histogram.120, Previous Count: 11937532, New Count: 11937528
>>> Byte Value: histogram.121, Previous Count: 11926907, New Count: 11926921
>>> Byte Value: histogram.32, Previous Count: 11929486, New Count: 11929489
>>> Byte Value: histogram.33, Previous Count: 11930737, New Count: 11930741
>>> Byte Value: histogram.34, Previous Count: 11931092, New Count: 11931086
>>> Byte Value: histogram.36, Previous Count: 11927605, New Count: 11927615
>>> Byte Value: histogram.37, Previous Count: 11930735, New Count: 11930745
>>> Byte Value: histogram.38, Previous Count: 11932174, New Count: 11932178
>>> Byte Value: histogram.39, Previous Count: 11936180, New Count: 11936182
>>> Byte Value: histogram.40, Previous Count: 11931666, New Count: 11931676
>>> Byte Value: histogram.41, Previous Count: 11927043, New Count: 11927034
>>> Byte Value: histogram.42, Previous Count: 11929044, New Count: 11929042
>>> Byte Value: histogram.43, Previous Count: 11934104, New Count: 11934098
>>> Byte Value: histogram.44, Previous Count: 11936337, New Count: 11936346
>>> Byte Value: histogram.45, Previous Count: 11935580, New Count: 11935582
>>> Byte Value: histogram.46, Previous Count: 11929598, New Count: 11929599
>>> Byte Value: histogram.47, Previous Count: 11934083, New Count: 11934085
>>> Byte Value: histogram.48, Previous Count: 11928858, New Count: 11928860
>>> Byte Value: histogram.49, Previous Count: 11931098, New Count: 11931113
>>> Byte Value: histogram.50, Previous Count: 11930618, New Count: 11930614
>>> Byte Value: histogram.51, Previous Count: 11925429, New Count: 11925435
>>> Byte Value: histogram.52, Previous Count: 11929741, New Count: 11929733
>>> Byte Value: histogram.53, Previous Count: 11934160, New Count: 11934155
>>> Byte Value: histogram.54, Previous Count: 11931999, New Count: 11931980
>>> Byte Value: histogram.55, Previous Count: 11930465, New Count: 11930477
>>> Byte Value: histogram.56, Previous Count: 11926194, New Count: 11926190
>>> Byte Value: histogram.57, Previous Count: 11926386, New Count: 11926381
>>> Byte Value: histogram.58, Previous Count: 11924871, New Count: 11924865
>>> Byte Value: histogram.59, Previous Count: 11929331, New Count: 11929326
>>> Byte Value: histogram.60, Previous Count: 11926951, New Count: 11926943
>>> Byte Value: histogram.61, Previous Count: 11928631, New Count: 11928619
>>> Byte Value: histogram.62, Previous Count: 11927549, New Count: 11927553
>>> Byte Value: histogram.63, Previous Count: 23856730, New Count: 23856718
>>> Byte Value: histogram.64, Previous Count: 11930288, New Count: 11930293
>>> Byte Value: histogram.65, Previous Count: 11931523, New Count: 11931527
>>> Byte Value: histogram.66, Previous Count: 11932821, New Count: 11932818
>>> Byte Value: histogram.67, Previous Count: 11932509, New Count: 11932510
>>> Byte Value: histogram.68, Previous Count: 11929613, New Count: 11929614
>>> Byte Value: histogram.69, Previous Count: 11928651, New Count: 11928654
>>> Byte Value: histogram.70, Previous Count: 11929253, New Count: 11929247
>>> Byte Value: histogram.71, Previous Count: 11931521, New Count: 11931512
>>> Byte Value: histogram.72, Previous Count: 11925805, New Count: 11925808
>>> Byte Value: histogram.73, Previous Count: 11934833, New Count: 11934826
>>> Byte Value: histogram.74, Previous Count: 11928314, New Count: 11928312
>>> Byte Value: histogram.75, Previous Count: 11923854, New Count: 11923863
>>> Byte Value: histogram.76, Previous Count: 11930892, New Count: 11930898
>>> Byte Value: histogram.77, Previous Count: 11927528, New Count: 11927525
>>> Byte Value: histogram.78, Previous Count: 11932850, New Count: 11932857
>>> Byte Value: histogram.79, Previous Count: 11934471, New Count: 11934461
>>> Byte Value: histogram.80, Previous Count: 11925707, New Count: 11925714
>>> Byte Value: histogram.81, Previous Count: 11929213, New Count: 11929206
>>> Byte Value: histogram.82, Previous Count: 11931334, New Count: 11931323
>>> Byte Value: histogram.83, Previous Count: 11936739, New Count: 11936732
>>> Byte Value: histogram.84, Previous Count: 11927855, New Count: 11927832
>>> Byte Value: histogram.85, Previous Count: 11931668, New Count: 11931665
>>> Byte Value: histogram.86, Previous Count: 11928609, New Count: 11928604
>>> Byte Value: histogram.87, Previous Count: 11931930, New Count: 11931933
>>> Byte Value: histogram.88, Previous Count: 11934341, New Count: 11934345
>>> Byte Value: histogram.89, Previous Count: 11927519, New Count: 11927518
>>> Byte Value: histogram.9, Previous Count: 11928004, New Count: 11928001
>>> Byte Value: histogram.90, Previous Count: 11933502, New Count: 11933517
>>> Byte Value: histogram.94, Previous Count: 11932024, New Count: 11932035
>>> Byte Value: histogram.95, Previous Count: 11932693, New Count: 11932679
>>> Byte Value: histogram.97, Previous Count: 11928428, New Count: 11928424
>>> Byte Value: histogram.98, Previous Count: 11933195, New Count: 11933196
>>> Byte Value: histogram.99, Previous Count: 11924273, New Count: 11924282
>>> 
>>>> On Tue, 2 Nov 2021 at 15:41, Mark Payne <[email protected]> wrote:
>>>> 
>>>> Jens,
>>>> 
>>>> The histograms, in and of themselves, are not very interesting. The 
>>>> interesting thing would be the difference in the histogram before & after 
>>>> the hash. Can you provide the ERROR level logs generated by the 
>>>> ExecuteScript? That’s what is of interest.
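>>>> 
>>>> The before-and-after comparison can be sketched in Python (hypothetical
>>>> function names; a sketch of the idea, not the actual ExecuteScript code):

```python
from collections import Counter


def byte_histogram(path, chunk_size=1 << 20):
    """Count how many times each byte value 0-255 occurs in a file."""
    counts = Counter()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            counts.update(chunk)  # iterating bytes yields int values 0-255
    return [counts.get(v, 0) for v in range(256)]


def histogram_differences(previous, new):
    """Byte values whose counts changed between two reads of the same content."""
    return [(v, p, n) for v, (p, n) in enumerate(zip(previous, new)) if p != n]
```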
>>>> 
>>>> Thanks
>>>> -Mark
>>>> 
>>>> 
>>>> On Nov 2, 2021, at 1:35 AM, Jens M. Kofoed <[email protected]> wrote:
>>>> 
>>>> Hi Mark and Joe
>>>> 
>>>> Yesterday morning I implemented Mark's script in my 2 test flows, one 
>>>> using SFTP and the other MergeContent/UnpackContent. Both test flows 
>>>> are running on a test cluster with 3 nodes and NiFi 1.14.0.
>>>> The 1st flow, with SFTP, has had 1 file go into the failure queue after 
>>>> about 16 hours.
>>>> The 2nd flow has had 2 files go into the failure queue, after about 15 
>>>> and 17 hours.
>>>> 
>>>> There is definitely something going wrong in my setup, but I can't 
>>>> figure out what.
>>>> 
>>>> Information from file 1:
>>>> histogram.0;0
>>>> histogram.1;0
>>>> histogram.10;11926720
>>>> histogram.100;11927504
>>>> histogram.101;11925396
>>>> histogram.102;11929923
>>>> histogram.103;11931596
>>>> histogram.104;11929071
>>>> histogram.105;11931365
>>>> histogram.106;11928661
>>>> histogram.107;11929864
>>>> histogram.108;11931611
>>>> histogram.109;11932758
>>>> histogram.11;0
>>>> histogram.110;11927893
>>>> histogram.111;11933519
>>>> histogram.112;11931392
>>>> histogram.113;11928534
>>>> histogram.114;11936879
>>>> histogram.115;11932818
>>>> histogram.116;11934767
>>>> histogram.117;11929143
>>>> histogram.118;11931854
>>>> histogram.119;11926333
>>>> histogram.12;0
>>>> histogram.120;11928731
>>>> histogram.121;11931149
>>>> histogram.122;11926725
>>>> histogram.123;0
>>>> histogram.124;0
>>>> histogram.125;0
>>>> histogram.126;0
>>>> histogram.127;0
>>>> histogram.128;0
>>>> histogram.129;0
>>>> histogram.13;0
>>>> histogram.130;0
>>>> histogram.131;0
>>>> histogram.132;0
>>>> histogram.133;0
>>>> histogram.134;0
>>>> histogram.135;0
>>>> histogram.136;0
>>>> histogram.137;0
>>>> histogram.138;0
>>>> histogram.139;0
>>>> histogram.14;0
>>>> histogram.140;0
>>>> histogram.141;0
>>>> histogram.142;0
>>>> histogram.143;0
>>>> histogram.144;0
>>>> histogram.145;0
>>>> histogram.146;0
>>>> histogram.147;0
>>>> histogram.148;0
>>>> histogram.149;0
>>>> histogram.15;0
>>>> histogram.150;0
>>>> histogram.151;0
>>>> histogram.152;0
>>>> histogram.153;0
>>>> histogram.154;0
>>>> histogram.155;0
>>>> histogram.156;0
>>>> histogram.157;0
>>>> histogram.158;0
>>>> histogram.159;0
>>>> histogram.16;0
>>>> histogram.160;0
>>>> histogram.161;0
>>>> histogram.162;0
>>>> histogram.163;0
>>>> histogram.164;0
>>>> histogram.165;0
>>>> histogram.166;0
>>>> histogram.167;0
>>>> histogram.168;0
>>>> histogram.169;0
>>>> histogram.17;0
>>>> histogram.170;0
>>>> histogram.171;0
>>>> histogram.172;0
>>>> histogram.173;0
>>>> histogram.174;0
>>>> histogram.175;0
>>>> histogram.176;0
>>>> histogram.177;0
>>>> histogram.178;0
>>>> histogram.179;0
>>>> histogram.18;0
>>>> histogram.180;0
>>>> histogram.181;0
>>>> histogram.182;0
>>>> histogram.183;0
>>>> histogram.184;0
>>>> histogram.185;0
>>>> histogram.186;0
>>>> histogram.187;0
>>>> histogram.188;0
>>>> histogram.189;0
>>>> histogram.19;0
>>>> histogram.190;0
>>>> histogram.191;0
>>>> histogram.192;0
>>>> histogram.193;0
>>>> histogram.194;0
>>>> histogram.195;0
>>>> histogram.196;0
>>>> histogram.197;0
>>>> histogram.198;0
>>>> histogram.199;0
>>>> histogram.2;0
>>>> histogram.20;0
>>>> histogram.200;0
>>>> histogram.201;0
>>>> histogram.202;0
>>>> histogram.203;0
>>>> histogram.204;0
>>>> histogram.205;0
>>>> histogram.206;0
>>>> histogram.207;0
>>>> histogram.208;0
>>>> histogram.209;0
>>>> histogram.21;0
>>>> histogram.210;0
>>>> histogram.211;0
>>>> histogram.212;0
>>>> histogram.213;0
>>>> histogram.214;0
>>>> histogram.215;0
>>>> histogram.216;0
>>>> histogram.217;0
>>>> histogram.218;0
>>>> histogram.219;0
>>>> histogram.22;0
>>>> histogram.220;0
>>>> histogram.221;0
>>>> histogram.222;0
>>>> histogram.223;0
>>>> histogram.224;0
>>>> histogram.225;0
>>>> histogram.226;0
>>>> histogram.227;0
>>>> histogram.228;0
>>>> histogram.229;0
>>>> histogram.23;0
>>>> histogram.230;0
>>>> histogram.231;0
>>>> histogram.232;0
>>>> histogram.233;0
>>>> histogram.234;0
>>>> histogram.235;0
>>>> histogram.236;0
>>>> histogram.237;0
>>>> histogram.238;0
>>>> histogram.239;0
>>>> histogram.24;0
>>>> histogram.240;0
>>>> histogram.241;0
>>>> histogram.242;0
>>>> histogram.243;0
>>>> histogram.244;0
>>>> histogram.245;0
>>>> histogram.246;0
>>>> histogram.247;0
>>>> histogram.248;0
>>>> histogram.249;0
>>>> histogram.25;0
>>>> histogram.250;0
>>>> histogram.251;0
>>>> histogram.252;0
>>>> histogram.253;0
>>>> histogram.254;0
>>>> histogram.255;0
>>>> histogram.26;0
>>>> histogram.27;0
>>>> histogram.28;0
>>>> histogram.29;0
>>>> histogram.3;0
>>>> histogram.30;0
>>>> histogram.31;0
>>>> histogram.32;11930422
>>>> histogram.33;11934311
>>>> histogram.34;11930459
>>>> histogram.35;11924776
>>>> histogram.36;11924186
>>>> histogram.37;11928616
>>>> histogram.38;11929474
>>>> histogram.39;11929607
>>>> histogram.4;0
>>>> histogram.40;11928053
>>>> histogram.41;11930402
>>>> histogram.42;11926830
>>>> histogram.43;11938138
>>>> histogram.44;11932536
>>>> histogram.45;11931053
>>>> histogram.46;11930008
>>>> histogram.47;11927747
>>>> histogram.48;11936055
>>>> histogram.49;11931471
>>>> histogram.5;0
>>>> histogram.50;11931921
>>>> histogram.51;11929643
>>>> histogram.52;11923847
>>>> histogram.53;11927311
>>>> histogram.54;11933754
>>>> histogram.55;11925964
>>>> histogram.56;11928872
>>>> histogram.57;11931124
>>>> histogram.58;11928474
>>>> histogram.59;11925814
>>>> histogram.6;0
>>>> histogram.60;11933978
>>>> histogram.61;11934136
>>>> histogram.62;11932016
>>>> histogram.63;23864588
>>>> histogram.64;11924792
>>>> histogram.65;11934789
>>>> histogram.66;11933047
>>>> histogram.67;11931899
>>>> histogram.68;11935615
>>>> histogram.69;11927249
>>>> histogram.7;0
>>>> histogram.70;11933276
>>>> histogram.71;11927953
>>>> histogram.72;11929275
>>>> histogram.73;11930292
>>>> histogram.74;11935428
>>>> histogram.75;11930317
>>>> histogram.76;11935737
>>>> histogram.77;11932127
>>>> histogram.78;11932344
>>>> histogram.79;11932094
>>>> histogram.8;0
>>>> histogram.80;11930688
>>>> histogram.81;11928415
>>>> histogram.82;11931559
>>>> histogram.83;11934192
>>>> histogram.84;11927224
>>>> histogram.85;11929491
>>>> histogram.86;11930624
>>>> histogram.87;11932201
>>>> histogram.88;11930694
>>>> histogram.89;11936439
>>>> histogram.9;11933187
>>>> histogram.90;11926445
>>>> histogram.91;0
>>>> histogram.92;0
>>>> histogram.93;0
>>>> histogram.94;11931596
>>>> histogram.95;11929379
>>>> histogram.96;0
>>>> histogram.97;11928864
>>>> histogram.98;11924738
>>>> histogram.99;11930062
>>>> histogram.totalBytes;1073741824
>>>> 
>>>> File 2:
>>>> histogram.0;0
>>>> histogram.1;0
>>>> histogram.10;11932402
>>>> histogram.100;11927531
>>>> histogram.101;11928454
>>>> histogram.102;11934432
>>>> histogram.103;11924623
>>>> histogram.104;11934492
>>>> histogram.105;11934585
>>>> histogram.106;11928955
>>>> histogram.107;11928651
>>>> histogram.108;11930139
>>>> histogram.109;11929325
>>>> histogram.11;0
>>>> histogram.110;11930486
>>>> histogram.111;11933517
>>>> histogram.112;11928334
>>>> histogram.113;11927798
>>>> histogram.114;11929222
>>>> histogram.115;11932057
>>>> histogram.116;11931182
>>>> histogram.117;11933407
>>>> histogram.118;11932709
>>>> histogram.119;11931338
>>>> histogram.12;0
>>>> histogram.120;11933700
>>>> histogram.121;11929803
>>>> histogram.122;11930218
>>>> histogram.123;0
>>>> histogram.124;0
>>>> histogram.125;0
>>>> histogram.126;0
>>>> histogram.127;0
>>>> histogram.128;0
>>>> histogram.129;0
>>>> histogram.13;0
>>>> histogram.130;0
>>>> histogram.131;0
>>>> histogram.132;0
>>>> histogram.133;0
>>>> histogram.134;0
>>>> histogram.135;0
>>>> histogram.136;0
>>>> histogram.137;0
>>>> histogram.138;0
>>>> histogram.139;0
>>>> histogram.14;0
>>>> histogram.140;0
>>>> histogram.141;0
>>>> histogram.142;0
>>>> histogram.143;0
>>>> histogram.144;0
>>>> histogram.145;0
>>>> histogram.146;0
>>>> histogram.147;0
>>>> histogram.148;0
>>>> histogram.149;0
>>>> histogram.15;0
>>>> histogram.150;0
>>>> histogram.151;0
>>>> histogram.152;0
>>>> histogram.153;0
>>>> histogram.154;0
>>>> histogram.155;0
>>>> histogram.156;0
>>>> histogram.157;0
>>>> histogram.158;0
>>>> histogram.159;0
>>>> histogram.16;0
>>>> histogram.160;0
>>>> histogram.161;0
>>>> histogram.162;0
>>>> histogram.163;0
>>>> histogram.164;0
>>>> histogram.165;0
>>>> histogram.166;0
>>>> histogram.167;0
>>>> histogram.168;0
>>>> histogram.169;0
>>>> histogram.17;0
>>>> histogram.170;0
>>>> histogram.171;0
>>>> histogram.172;0
>>>> histogram.173;0
>>>> histogram.174;0
>>>> histogram.175;0
>>>> histogram.176;0
>>>> histogram.177;0
>>>> histogram.178;0
>>>> histogram.179;0
>>>> histogram.18;0
>>>> histogram.180;0
>>>> histogram.181;0
>>>> histogram.182;0
>>>> histogram.183;0
>>>> histogram.184;0
>>>> histogram.185;0
>>>> histogram.186;0
>>>> histogram.187;0
>>>> histogram.188;0
>>>> histogram.189;0
>>>> histogram.19;0
>>>> histogram.190;0
>>>> histogram.191;0
>>>> histogram.192;0
>>>> histogram.193;0
>>>> histogram.194;0
>>>> histogram.195;0
>>>> histogram.196;0
>>>> histogram.197;0
>>>> histogram.198;0
>>>> histogram.199;0
>>>> histogram.2;0
>>>> histogram.20;0
>>>> histogram.200;0
>>>> histogram.201;0
>>>> histogram.202;0
>>>> histogram.203;0
>>>> histogram.204;0
>>>> histogram.205;0
>>>> histogram.206;0
>>>> histogram.207;0
>>>> histogram.208;0
>>>> histogram.209;0
>>>> histogram.21;0
>>>> histogram.210;0
>>>> histogram.211;0
>>>> histogram.212;0
>>>> histogram.213;0
>>>> histogram.214;0
>>>> histogram.215;0
>>>> histogram.216;0
>>>> histogram.217;0
>>>> histogram.218;0
>>>> histogram.219;0
>>>> histogram.22;0
>>>> histogram.220;0
>>>> histogram.221;0
>>>> histogram.222;0
>>>> histogram.223;0
>>>> histogram.224;0
>>>> histogram.225;0
>>>> histogram.226;0
>>>> histogram.227;0
>>>> histogram.228;0
>>>> histogram.229;0
>>>> histogram.23;0
>>>> histogram.230;0
>>>> histogram.231;0
>>>> histogram.232;0
>>>> histogram.233;0
>>>> histogram.234;0
>>>> histogram.235;0
>>>> histogram.236;0
>>>> histogram.237;0
>>>> histogram.238;0
>>>> histogram.239;0
>>>> histogram.24;0
>>>> histogram.240;0
>>>> histogram.241;0
>>>> histogram.242;0
>>>> histogram.243;0
>>>> histogram.244;0
>>>> histogram.245;0
>>>> histogram.246;0
>>>> histogram.247;0
>>>> histogram.248;0
>>>> histogram.249;0
>>>> histogram.25;0
>>>> histogram.250;0
>>>> histogram.251;0
>>>> histogram.252;0
>>>> histogram.253;0
>>>> histogram.254;0
>>>> histogram.255;0
>>>> histogram.26;0
>>>> histogram.27;0
>>>> histogram.28;0
>>>> histogram.29;0
>>>> histogram.3;0
>>>> histogram.30;0
>>>> histogram.31;0
>>>> histogram.32;11924458
>>>> histogram.33;11934243
>>>> histogram.34;11930696
>>>> histogram.35;11925574
>>>> histogram.36;11929198
>>>> histogram.37;11928146
>>>> histogram.38;11932505
>>>> histogram.39;11929406
>>>> histogram.4;0
>>>> histogram.40;11930100
>>>> histogram.41;11930867
>>>> histogram.42;11930796
>>>> histogram.43;11930796
>>>> histogram.44;11921866
>>>> histogram.45;11935682
>>>> histogram.46;11930075
>>>> histogram.47;11928169
>>>> histogram.48;11933490
>>>> histogram.49;11932174
>>>> histogram.5;0
>>>> histogram.50;11933255
>>>> histogram.51;11934009
>>>> histogram.52;11928361
>>>> histogram.53;11927626
>>>> histogram.54;11931611
>>>> histogram.55;11930755
>>>> histogram.56;11933823
>>>> histogram.57;11922508
>>>> histogram.58;11930384
>>>> histogram.59;11929805
>>>> histogram.6;0
>>>> histogram.60;11930064
>>>> histogram.61;11926761
>>>> histogram.62;11927605
>>>> histogram.63;23858926
>>>> histogram.64;11929516
>>>> histogram.65;11930217
>>>> histogram.66;11930478
>>>> histogram.67;11939855
>>>> histogram.68;11927850
>>>> histogram.69;11931154
>>>> histogram.7;0
>>>> histogram.70;11935374
>>>> histogram.71;11930754
>>>> histogram.72;11928304
>>>> histogram.73;11931772
>>>> histogram.74;11939417
>>>> histogram.75;11930712
>>>> histogram.76;11933331
>>>> histogram.77;11931279
>>>> histogram.78;11928276
>>>> histogram.79;11930071
>>>> histogram.8;0
>>>> histogram.80;11927830
>>>> histogram.81;11931213
>>>> histogram.82;11930964
>>>> histogram.83;11928973
>>>> histogram.84;11934325
>>>> histogram.85;11929658
>>>> histogram.86;11924667
>>>> histogram.87;11931100
>>>> histogram.88;11930252
>>>> histogram.89;11927281
>>>> histogram.9;11932848
>>>> histogram.90;11930398
>>>> histogram.91;0
>>>> histogram.92;0
>>>> histogram.93;0
>>>> histogram.94;11928720
>>>> histogram.95;11928988
>>>> histogram.96;0
>>>> histogram.97;11931423
>>>> histogram.98;11928181
>>>> histogram.99;11935549
>>>> histogram.totalBytes;1073741824
>>>> 
>>>> File3:
>>>> histogram.0;0
>>>> histogram.1;0
>>>> histogram.10;11930417
>>>> histogram.100;11926739
>>>> histogram.101;11930580
>>>> histogram.102;11928210
>>>> histogram.103;11935300
>>>> histogram.104;11925804
>>>> histogram.105;11931023
>>>> histogram.106;11932342
>>>> histogram.107;11929778
>>>> histogram.108;11930098
>>>> histogram.109;11930759
>>>> histogram.11;0
>>>> histogram.110;11934343
>>>> histogram.111;11935775
>>>> histogram.112;11933877
>>>> histogram.113;11926675
>>>> histogram.114;11929332
>>>> histogram.115;11928876
>>>> histogram.116;11927819
>>>> histogram.117;11932657
>>>> histogram.118;11933508
>>>> histogram.119;11928808
>>>> histogram.12;0
>>>> histogram.120;11937532
>>>> histogram.121;11926907
>>>> histogram.122;11933942
>>>> histogram.123;0
>>>> histogram.124;0
>>>> histogram.125;0
>>>> histogram.126;0
>>>> histogram.127;0
>>>> histogram.128;0
>>>> histogram.129;0
>>>> histogram.13;0
>>>> histogram.130;0
>>>> histogram.131;0
>>>> histogram.132;0
>>>> histogram.133;0
>>>> histogram.134;0
>>>> histogram.135;0
>>>> histogram.136;0
>>>> histogram.137;0
>>>> histogram.138;0
>>>> histogram.139;0
>>>> histogram.14;0
>>>> histogram.140;0
>>>> histogram.141;0
>>>> histogram.142;0
>>>> histogram.143;0
>>>> histogram.144;0
>>>> histogram.145;0
>>>> histogram.146;0
>>>> histogram.147;0
>>>> histogram.148;0
>>>> histogram.149;0
>>>> histogram.15;0
>>>> histogram.150;0
>>>> histogram.151;0
>>>> histogram.152;0
>>>> histogram.153;0
>>>> histogram.154;0
>>>> histogram.155;0
>>>> histogram.156;0
>>>> histogram.157;0
>>>> histogram.158;0
>>>> histogram.159;0
>>>> histogram.16;0
>>>> histogram.160;0
>>>> histogram.161;0
>>>> histogram.162;0
>>>> histogram.163;0
>>>> histogram.164;0
>>>> histogram.165;0
>>>> histogram.166;0
>>>> histogram.167;0
>>>> histogram.168;0
>>>> histogram.169;0
>>>> histogram.17;0
>>>> histogram.170;0
>>>> histogram.171;0
>>>> histogram.172;0
>>>> histogram.173;0
>>>> histogram.174;0
>>>> histogram.175;0
>>>> histogram.176;0
>>>> histogram.177;0
>>>> histogram.178;0
>>>> histogram.179;0
>>>> histogram.18;0
>>>> histogram.180;0
>>>> histogram.181;0
>>>> histogram.182;0
>>>> histogram.183;0
>>>> histogram.184;0
>>>> histogram.185;0
>>>> histogram.186;0
>>>> histogram.187;0
>>>> histogram.188;0
>>>> histogram.189;0
>>>> histogram.19;0
>>>> histogram.190;0
>>>> histogram.191;0
>>>> histogram.192;0
>>>> histogram.193;0
>>>> histogram.194;0
>>>> histogram.195;0
>>>> histogram.196;0
>>>> histogram.197;0
>>>> histogram.198;0
>>>> histogram.199;0
>>>> histogram.2;0
>>>> histogram.20;0
>>>> histogram.200;0
>>>> histogram.201;0
>>>> histogram.202;0
>>>> histogram.203;0
>>>> histogram.204;0
>>>> histogram.205;0
>>>> histogram.206;0
>>>> histogram.207;0
>>>> histogram.208;0
>>>> histogram.209;0
>>>> histogram.21;0
>>>> histogram.210;0
>>>> histogram.211;0
>>>> histogram.212;0
>>>> histogram.213;0
>>>> histogram.214;0
>>>> histogram.215;0
>>>> histogram.216;0
>>>> histogram.217;0
>>>> histogram.218;0
>>>> histogram.219;0
>>>> histogram.22;0
>>>> histogram.220;0
>>>> histogram.221;0
>>>> histogram.222;0
>>>> histogram.223;0
>>>> histogram.224;0
>>>> histogram.225;0
>>>> histogram.226;0
>>>> histogram.227;0
>>>> histogram.228;0
>>>> histogram.229;0
>>>> histogram.23;0
>>>> histogram.230;0
>>>> histogram.231;0
>>>> histogram.232;0
>>>> histogram.233;0
>>>> histogram.234;0
>>>> histogram.235;0
>>>> histogram.236;0
>>>> histogram.237;0
>>>> histogram.238;0
>>>> histogram.239;0
>>>> histogram.24;0
>>>> histogram.240;0
>>>> histogram.241;0
>>>> histogram.242;0
>>>> histogram.243;0
>>>> histogram.244;0
>>>> histogram.245;0
>>>> histogram.246;0
>>>> histogram.247;0
>>>> histogram.248;0
>>>> histogram.249;0
>>>> histogram.25;0
>>>> histogram.250;0
>>>> histogram.251;0
>>>> histogram.252;0
>>>> histogram.253;0
>>>> histogram.254;0
>>>> histogram.255;0
>>>> histogram.26;0
>>>> histogram.27;0
>>>> histogram.28;0
>>>> histogram.29;0
>>>> histogram.3;0
>>>> histogram.30;0
>>>> histogram.31;0
>>>> histogram.32;11929486
>>>> histogram.33;11930737
>>>> histogram.34;11931092
>>>> histogram.35;11934488
>>>> histogram.36;11927605
>>>> histogram.37;11930735
>>>> histogram.38;11932174
>>>> histogram.39;11936180
>>>> histogram.4;0
>>>> histogram.40;11931666
>>>> histogram.41;11927043
>>>> histogram.42;11929044
>>>> histogram.43;11934104
>>>> histogram.44;11936337
>>>> histogram.45;11935580
>>>> histogram.46;11929598
>>>> histogram.47;11934083
>>>> histogram.48;11928858
>>>> histogram.49;11931098
>>>> histogram.5;0
>>>> histogram.50;11930618
>>>> histogram.51;11925429
>>>> histogram.52;11929741
>>>> histogram.53;11934160
>>>> histogram.54;11931999
>>>> histogram.55;11930465
>>>> histogram.56;11926194
>>>> histogram.57;11926386
>>>> histogram.58;11924871
>>>> histogram.59;11929331
>>>> histogram.6;0
>>>> histogram.60;11926951
>>>> histogram.61;11928631
>>>> histogram.62;11927549
>>>> histogram.63;23856730
>>>> histogram.64;11930288
>>>> histogram.65;11931523
>>>> histogram.66;11932821
>>>> histogram.67;11932509
>>>> histogram.68;11929613
>>>> histogram.69;11928651
>>>> histogram.7;0
>>>> histogram.70;11929253
>>>> histogram.71;11931521
>>>> histogram.72;11925805
>>>> histogram.73;11934833
>>>> histogram.74;11928314
>>>> histogram.75;11923854
>>>> histogram.76;11930892
>>>> histogram.77;11927528
>>>> histogram.78;11932850
>>>> histogram.79;11934471
>>>> histogram.8;0
>>>> histogram.80;11925707
>>>> histogram.81;11929213
>>>> histogram.82;11931334
>>>> histogram.83;11936739
>>>> histogram.84;11927855
>>>> histogram.85;11931668
>>>> histogram.86;11928609
>>>> histogram.87;11931930
>>>> histogram.88;11934341
>>>> histogram.89;11927519
>>>> histogram.9;11928004
>>>> histogram.90;11933502
>>>> histogram.91;0
>>>> histogram.92;0
>>>> histogram.93;0
>>>> histogram.94;11932024
>>>> histogram.95;11932693
>>>> histogram.96;0
>>>> histogram.97;11928428
>>>> histogram.98;11933195
>>>> histogram.99;11924273
>>>> histogram.totalBytes;1073741824
>>>> 
>>>> Kind regards
>>>> Jens
>>>> 
>>>>> On Sun, Oct 31, 2021 at 21:40, Joe Witt <[email protected]> wrote:
>>>>> 
>>>>> Jen
>>>>> 
>>>>> 118 hours in - still goood.
>>>>> 
>>>>> Thanks
>>>>> 
>>>>>> On Fri, Oct 29, 2021 at 10:22 AM Joe Witt <[email protected]> wrote:
>>>>>> 
>>>>>> Jens
>>>>>> 
>>>>>> Update from hour 67.  Still lookin' good.
>>>>>> 
>>>>>> Will advise.
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>>> On Thu, Oct 28, 2021 at 8:08 AM Jens M. Kofoed <[email protected]> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Many many thanks 🙏 Joe for looking into this. My test flow was running 
>>>>>>> for 6 days before the first error occurred
>>>>>>> 
>>>>>>> Thanks
>>>>>>> 
>>>>>>>> On Oct 28, 2021, at 16:57, Joe Witt <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> Jens,
>>>>>>>> 
>>>>>>>> Am 40+ hours in running both your flow and mine to reproduce.  So far
>>>>>>>> neither have shown any sign of trouble.  Will keep running for another
>>>>>>>> week or so if I can.
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> 
>>>>>>>>> On Wed, Oct 27, 2021 at 12:42 PM Jens M. Kofoed 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> The physical hosts running VMware use VMFS, but the VMs running on 
>>>>>>>>> those hosts can’t see that.
>>>>>>>>> But you asked about the underlying file system 😀 and since my first 
>>>>>>>>> answer with the copy from the fstab file wasn’t enough, I just wanted 
>>>>>>>>> to give all the details 😁.
>>>>>>>>> 
>>>>>>>>> If you create a vm for windows you would probably use NTFS (on top of 
>>>>>>>>> vmfs). For Linux EXT3, EXT4, BTRFS, XFS and so on.
>>>>>>>>> 
>>>>>>>>> All the partitions at my NiFi nodes are local devices (sda, sdb, sdc 
>>>>>>>>> and sdd) for each Linux machine. I don’t use NFS.
>>>>>>>>> 
>>>>>>>>> Kind regards
>>>>>>>>> Jens
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Oct 27, 2021, at 17:47, Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> I don't quite follow the EXT4 usage on top of VMFS but the point here
>>>>>>>>> is you'll ultimately need to truly understand your underlying storage
>>>>>>>>> system and what sorts of guarantees it is giving you.  If linux/the
>>>>>>>>> jvm/nifi think it has a typical EXT4 type block storage system to work
>>>>>>>>> with it can only be safe/operate within those constraints.  I have no
>>>>>>>>> idea about what VMFS brings to the table or the settings for it.
>>>>>>>>> 
>>>>>>>>> The sync properties I shared previously might help force the issue of
>>>>>>>>> ensuring a formal sync/flush cycle all the way through the disk has
>>>>>>>>> occurred which we'd normally not do or need to do but again in some
>>>>>>>>> cases offers a stronger guarantee in exchange for performance.
>>>>>>>>> 
>>>>>>>>> In any case...Mark's path for you here will help identify what we're
>>>>>>>>> dealing with and we can go from there.
>>>>>>>>> 
>>>>>>>>> I am aware of significant usage of NiFi on VMWare configurations
>>>>>>>>> without issue at high rates for many years so whatever it is here is
>>>>>>>>> likely solvable.
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 27, 2021 at 7:28 AM Jens M. Kofoed 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Mark
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks for the clarification. I will implement the script when I 
>>>>>>>>> return to the office at Monday next week ( November 1st).
>>>>>>>>> 
>>>>>>>>> I don’t use NFS, but ext4. But I will implement the script so we can 
>>>>>>>>> check whether that is the case here. However, I think the issue might 
>>>>>>>>> occur after the processors write content to the repository.
>>>>>>>>> 
>>>>>>>>> I have a test flow running for more than 2 weeks without any errors. 
>>>>>>>>> But this flow only calculates a hash and compares.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Two other flows both create errors. One flow use 
>>>>>>>>> PutSFTP->FetchSFTP->CryptographicHashContent->compares. The other 
>>>>>>>>> flow use 
>>>>>>>>> MergeContent->UnpackContent->CryptographicHashContent->compares. The 
>>>>>>>>> last flow is totally inside nifi, excluding other network/server 
>>>>>>>>> issues.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> In both cases the CryptographicHashContent is right after a process 
>>>>>>>>> which writes new content to the repository. But in one case a file in 
>>>>>>>>> our production flow calculated a wrong hash 4 times, with a 1 
>>>>>>>>> minute delay between each calculation. A few hours later I looped 
>>>>>>>>> the file back, and this time it was OK.
>>>>>>>>> 
>>>>>>>>> Just like the case in step 5 and 12 in the pdf file
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I will let you all know more later next week
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Kind regards
>>>>>>>>> 
>>>>>>>>> Jens
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Oct 27, 2021, at 15:43, Mark Payne <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> And the actual script:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> import org.apache.nifi.flowfile.FlowFile
>>>>>>>>> import java.util.stream.Collectors
>>>>>>>>> 
>>>>>>>>> Map<String, String> getPreviousHistogram(final FlowFile flowFile) {
>>>>>>>>>     final Map<String, String> histogram = flowFile.getAttributes().entrySet().stream()
>>>>>>>>>         .filter({ entry -> entry.getKey().startsWith("histogram.") })
>>>>>>>>>         .collect(Collectors.toMap({ entry -> entry.key }, { entry -> entry.value }))
>>>>>>>>>     return histogram
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> Map<String, String> createHistogram(final FlowFile flowFile, final InputStream inStream) {
>>>>>>>>>     final Map<String, String> histogram = new HashMap<>()
>>>>>>>>>     final int[] distribution = new int[256]
>>>>>>>>>     Arrays.fill(distribution, 0)
>>>>>>>>> 
>>>>>>>>>     long total = 0L
>>>>>>>>>     final byte[] buffer = new byte[8192]
>>>>>>>>>     int len
>>>>>>>>>     while ((len = inStream.read(buffer)) > 0) {
>>>>>>>>>         for (int i = 0; i < len; i++) {
>>>>>>>>>             // Mask to 0-255: bytes are signed in Java/Groovy, so values >= 0x80
>>>>>>>>>             // would otherwise be negative and index out of bounds.
>>>>>>>>>             final int val = buffer[i] & 0xFF
>>>>>>>>>             distribution[val]++
>>>>>>>>>             total++
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>> 
>>>>>>>>>     for (int i = 0; i < 256; i++) {
>>>>>>>>>         histogram.put("histogram." + i, String.valueOf(distribution[i]))
>>>>>>>>>     }
>>>>>>>>>     histogram.put("histogram.totalBytes", String.valueOf(total))
>>>>>>>>>     return histogram
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> void logHistogramDifferences(final Map<String, String> previous, final Map<String, String> updated) {
>>>>>>>>>     final StringBuilder sb = new StringBuilder("There are differences in the histogram\n")
>>>>>>>>>     final Map<String, String> sorted = new TreeMap<>(previous)
>>>>>>>>>     for (final Map.Entry<String, String> entry : sorted.entrySet()) {
>>>>>>>>>         final String key = entry.getKey()
>>>>>>>>>         final String previousValue = entry.getValue()
>>>>>>>>>         final String updatedValue = updated.get(key)
>>>>>>>>>         if (!Objects.equals(previousValue, updatedValue)) {
>>>>>>>>>             sb.append("Byte Value: ").append(key)
>>>>>>>>>               .append(", Previous Count: ").append(previousValue)
>>>>>>>>>               .append(", New Count: ").append(updatedValue).append("\n")
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>     log.error(sb.toString())
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> def flowFile = session.get()
>>>>>>>>> if (flowFile == null) {
>>>>>>>>>     return
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> final Map<String, String> previousHistogram = getPreviousHistogram(flowFile)
>>>>>>>>> Map<String, String> histogram = null
>>>>>>>>> 
>>>>>>>>> final InputStream inStream = session.read(flowFile)
>>>>>>>>> try {
>>>>>>>>>     histogram = createHistogram(flowFile, inStream)
>>>>>>>>> } finally {
>>>>>>>>>     inStream.close()
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> if (!previousHistogram.isEmpty()) {
>>>>>>>>>     if (previousHistogram.equals(histogram)) {
>>>>>>>>>         log.info("Histograms match")
>>>>>>>>>     } else {
>>>>>>>>>         logHistogramDifferences(previousHistogram, histogram)
>>>>>>>>>         session.transfer(flowFile, REL_FAILURE)
>>>>>>>>>         return
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> flowFile = session.putAllAttributes(flowFile, histogram)
>>>>>>>>> session.transfer(flowFile, REL_SUCCESS)
>>>>>>>>> 
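One subtle detail worth checking in byte-histogram code like the script above: Java and Groovy bytes are signed, so a raw `buffer[i]` for byte values 0x80–0xFF is a negative int and would fail (or mis-count) when used as a 0–255 array index; masking with `& 0xFF` yields the intended unsigned value. A minimal Java illustration of the masking (a standalone sketch, not part of the original script):

```java
public class ByteMaskDemo {
    public static void main(String[] args) {
        byte[] data = { 0x41, (byte) 0xC3 }; // 'A' and a byte >= 0x80
        int[] distribution = new int[256];
        for (byte b : data) {
            int raw = b;           // sign-extended: 0xC3 becomes -61
            int masked = b & 0xFF; // unsigned value: 0xC3 becomes 195
            System.out.println(raw + " -> " + masked);
            // distribution[raw] would throw ArrayIndexOutOfBoundsException for 0xC3
            distribution[masked]++;
        }
        System.out.println(distribution[195]); // 1
    }
}
```

For plain ASCII content (all bytes below 0x80, as in the histograms above) the unmasked version happens to work, which is why the bug can hide.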
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Oct 27, 2021, at 9:43 AM, Mark Payne <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> For a bit of background here, the reason that Joe and I have 
>>>>>>>>> expressed interest in NFS file systems is that the way the protocol 
>>>>>>>>> works, it is allowed to receive packets/chunks of the file 
>>>>>>>>> out-of-order. So, what happens is let’s say a 1 MB file is being 
>>>>>>>>> written. The first 500 KB are received. Then instead of the 501st 
>>>>>>>>> KB it receives the 503rd KB. What happens is that the size of the 
>>>>>>>>> file on the file system becomes 503 KB. But what about 501 & 502? 
>>>>>>>>> Well when you read the data, the file system just returns ASCII NUL 
>>>>>>>>> characters (byte 0) for those bytes. Once the NFS server receives 
>>>>>>>>> those bytes, it then goes back and fills in the proper bytes. So if 
>>>>>>>>> you’re running on NFS, it is possible for the contents of the file on 
>>>>>>>>> the underlying file system to change out from under you. It’s not 
>>>>>>>>> clear to me what other types of file system might do something 
>>>>>>>>> similar.
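The "gap reads back as NUL" behavior described above can be reproduced locally with an ordinary sparse file: writing past the current end of file leaves a hole that reads back as zero bytes, analogous to what an NFS client can observe before the missing chunk arrives. A small Java sketch (file name and payloads are arbitrary):

```java
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class SparseGapDemo {
    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("sparse", ".bin");
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) {
            raf.write("AAAA".getBytes("US-ASCII")); // bytes 0-3 written
            raf.seek(8);                            // skip bytes 4-7 (the "gap")
            raf.write("BBBB".getBytes("US-ASCII")); // bytes 8-11 arrive "out of order"
        }
        byte[] content = Files.readAllBytes(path);
        // The file is now 12 bytes long; the unwritten gap reads back as NUL (0x00).
        System.out.println(content.length); // 12
        System.out.println(content[4]);     // 0
        System.out.println(content[0]);     // 65 ('A')
        Files.delete(path);
    }
}
```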
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> So, one thing that we can do is to find out whether or not the 
>>>>>>>>> contents of the underlying file have changed in some way, or if 
>>>>>>>>> there’s something else happening that could perhaps result in the 
>>>>>>>>> hashes being wrong. I’ve put together a script that should help 
>>>>>>>>> diagnose this.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Can you insert an ExecuteScript processor either just before or just 
>>>>>>>>> after your CryptographicHashContent processor? Doesn’t really matter 
>>>>>>>>> whether it’s run just before or just after. I’ll attach the script 
>>>>>>>>> here. It’s a Groovy Script so you should be able to use ExecuteScript 
>>>>>>>>> with Script Engine = Groovy and the following script as the Script 
>>>>>>>>> Body. No other changes needed.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> The way the script works, it reads in the contents of the FlowFile, 
>>>>>>>>> and then it builds up a histogram of all byte values (0-255) that it 
>>>>>>>>> sees in the contents, and then adds that as attributes. So it adds 
>>>>>>>>> attributes such as:
>>>>>>>>> 
>>>>>>>>> histogram.0 = 280273
>>>>>>>>> 
>>>>>>>>> histogram.1 = 2820
>>>>>>>>> 
>>>>>>>>> histogram.2 = 48202
>>>>>>>>> 
>>>>>>>>> histogram.3 = 3820
>>>>>>>>> 
>>>>>>>>> …
>>>>>>>>> 
>>>>>>>>> histogram.totalBytes = 1780928732
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> It then checks if those attributes have already been added. If so, 
>>>>>>>>> after calculating that histogram, it checks against the previous 
>>>>>>>>> values (in the attributes). If they are the same, the FlowFile goes 
>>>>>>>>> to ’success’. If they are different, it logs an error indicating the 
>>>>>>>>> before/after value for any byte whose distribution was different, and 
>>>>>>>>> it routes to failure.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> So, if for example, the first time through it sees 280,273 bytes with 
>>>>>>>>> a value of ‘0’, and the second time it only sees 12,001, then we know 
>>>>>>>>> there were a bunch of 0’s previously that were updated to be some 
>>>>>>>>> other value. And it includes the total number of bytes in case 
>>>>>>>>> somehow we find that we’re reading too many bytes or not enough bytes 
>>>>>>>>> or something like that. This should help narrow down what’s happening.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> -Mark
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Oct 26, 2021, at 6:25 PM, Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Attached is the flow I was using (now running yours and this one).  
>>>>>>>>> Curious if that one reproduces the issue for you as well.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Tue, Oct 26, 2021 at 3:09 PM Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I have your flow running and will keep it running for several 
>>>>>>>>> days/week to see if I can reproduce.  Also of note please use your 
>>>>>>>>> same test flow but use HashContent instead of crypto hash.  Curious 
>>>>>>>>> if that matters for any reason...
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Still want to know more about your underlying storage system.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> You could also try updating nifi.properties and changing the 
>>>>>>>>> following lines:
>>>>>>>>> 
>>>>>>>>> nifi.flowfile.repository.always.sync=true
>>>>>>>>> 
>>>>>>>>> nifi.content.repository.always.sync=true
>>>>>>>>> 
>>>>>>>>> nifi.provenance.repository.always.sync=true
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> It will hurt performance but can be useful/necessary on certain 
>>>>>>>>> storage subsystems.
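For context on what the `always.sync` properties change: they make NiFi force a sync to the storage device on repository writes, rather than letting data sit in the OS page cache. A rough Java sketch of the distinction (illustrative only; this is not NiFi's actual repository code):

```java
import java.io.FileOutputStream;
import java.io.IOException;

public class SyncSketch {
    // Without always.sync: after write()/flush(), data may still sit in the OS
    // page cache, reaching the physical disk only when the kernel writes it back.
    static void writeBuffered(FileOutputStream out, byte[] data) throws IOException {
        out.write(data);
        out.flush(); // flushes the JVM-side buffer, NOT the OS page cache
    }

    // With always.sync: an explicit sync forces the kernel to push the data
    // to the storage device before returning, like fsync(2).
    static void writeSynced(FileOutputStream out, byte[] data) throws IOException {
        out.write(data);
        out.getFD().sync();
    }
}
```

The synced path is where the performance cost comes from: every write waits for the device to acknowledge, in exchange for a durability guarantee the page cache alone cannot give.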
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Tue, Oct 26, 2021 at 12:05 PM Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Ignore "For the scenario where you can replicate this please share 
>>>>>>>>> the flow.xml.gz for which it is reproducible."  I see the uploaded 
>>>>>>>>> JSON
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Tue, Oct 26, 2021 at 12:04 PM Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> We asked about the underlying storage system.  You replied with some 
>>>>>>>>> info but not the specifics.  Do you know precisely what the 
>>>>>>>>> underlying storage is and how it is presented to the operating 
>>>>>>>>> system?  For instance is it NFS or something similar?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I've setup a very similar flow at extremely high rates running for 
>>>>>>>>> the past several days with no issue.  In my case though I know 
>>>>>>>>> precisely what the config is and the disk setup is.  Didn't do 
>>>>>>>>> anything special to be clear but still it is important to know.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> For the scenario where you can replicate this please share the 
>>>>>>>>> flow.xml.gz for which it is reproducible.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> Joe
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Sun, Oct 24, 2021 at 9:53 PM Jens M. Kofoed 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Dear Joe and Mark
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I have created a test flow without the SFTP processors, which doesn't 
>>>>>>>>> produce any errors. Therefore I created a new test flow where I use 
>>>>>>>>> MergeContent and UnpackContent instead of the SFTP processors. This 
>>>>>>>>> keeps all data internal in NiFi, but forces NiFi to write and read new 
>>>>>>>>> files totally locally.
>>>>>>>>> 
>>>>>>>>> My flow has been running for 7 days, and this morning there were 2 
>>>>>>>>> files where the sha256 was given another hash value than the 
>>>>>>>>> original. I have set this flow up in another NiFi cluster used only 
>>>>>>>>> for testing, and the cluster is not doing anything else. It is using 
>>>>>>>>> NiFi 1.14.0.
>>>>>>>>> 
>>>>>>>>> So I can reproduce issues at different NiFi clusters and versions 
>>>>>>>>> (1.13.2 and 1.14.0) where the calculation of a hash on content can 
>>>>>>>>> give different outputs. It doesn't make any sense, but it happens. In 
>>>>>>>>> all my cases the issue happens where the calculation of the 
>>>>>>>>> hash content happens right after NiFi writes the content to the 
>>>>>>>>> content repository. I don't know if there could be some kind of delay 
>>>>>>>>> in writing the content 100% before the next processors begin reading 
>>>>>>>>> the content???
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Please see attach test flow, and the previous mail with a pdf showing 
>>>>>>>>> the lineage of a production file which also had issues. In the pdf 
>>>>>>>>> check step 5 and 12.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Kind regards
>>>>>>>>> 
>>>>>>>>> Jens M. Kofoed
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Thu, Oct 21, 2021 at 08:28, Jens M. Kofoed 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Joe,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> To start from the last mail :-)
>>>>>>>>> 
>>>>>>>>> All the repositories has it's own disk, and I'm using ext4
>>>>>>>>> 
>>>>>>>>> /dev/VG_b/LV_b    /nifiRepo    ext4    defaults,noatime    0 0
>>>>>>>>> 
>>>>>>>>> /dev/VG_c/LV_c    /provRepo01    ext4    defaults,noatime    0 0
>>>>>>>>> 
>>>>>>>>> /dev/VG_d/LV_d    /contRepo01    ext4    defaults,noatime    0 0
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> My test flow WITH sftp looks like this:
>>>>>>>>> 
>>>>>>>>> <image.png>
>>>>>>>>> 
>>>>>>>>> And this flow has produced 1 error within 3 days. After many, many 
>>>>>>>>> loops the file failed and went out via the "unmatched" output to the 
>>>>>>>>> disabled UpdateAttribute, which does nothing; it is just there to keep 
>>>>>>>>> the failed flowfile in a queue. I enabled the UpdateAttribute and 
>>>>>>>>> looped the file back to the CryptographicHashContent, and now it 
>>>>>>>>> calculated the hash correctly again. But in this flow I have a 
>>>>>>>>> FetchSFTP process right before the hashing.
>>>>>>>>> 
>>>>>>>>> Right now my flow is running without the 2 SFTP processors, and in 
>>>>>>>>> the last 24 hours there have been no errors.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> About the Lineage:
>>>>>>>>> 
>>>>>>>>> Is there a way to export all the lineage data? The export only 
>>>>>>>>> generates an svg file.
>>>>>>>>> 
>>>>>>>>> This is only for the receiving NiFi, which internally calculates 2 
>>>>>>>>> different hashes on the same content with ca. 1 minute's delay. 
>>>>>>>>> Attached is a pdf document with the lineage, the flow and all the 
>>>>>>>>> relevant provenance information for each step in the lineage.
>>>>>>>>> 
>>>>>>>>> The interesting steps are steps 5 and 12.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Can the issue be that data is not written 100% to disk between steps 
>>>>>>>>> 4 and 5 in the flow?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Kind regards
>>>>>>>>> 
>>>>>>>>> Jens M. Kofoed
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 20, 2021 at 23:49, Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Also what type of file system/storage system are you running NiFi on
>>>>>>>>> 
>>>>>>>>> in this case?  We'll need to know this for the NiFi
>>>>>>>>> 
>>>>>>>>> content/flowfile/provenance repositories? Is it NFS?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 20, 2021 at 11:14 AM Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> And to further narrow this down
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> "I have a test flow, where a GenerateFlowfile has created 6x 1GB files
>>>>>>>>> 
>>>>>>>>> (2 files per node) and next process was a hashcontent before it run
>>>>>>>>> 
>>>>>>>>> into a test loop. Where files are uploaded via PutSFTP to a test
>>>>>>>>> 
>>>>>>>>> server, and downloaded again and recalculated the hash. I have had one
>>>>>>>>> 
>>>>>>>>> issue after 3 days of running."
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> So to be clear with GenerateFlowFile making these files and then you
>>>>>>>>> 
>>>>>>>>> looping the content is wholly and fully exclusively within the control
>>>>>>>>> 
>>>>>>>>> of NiFI.  No Get/Fetch/Put-SFTP of any kind at all. In by looping the
>>>>>>>>> 
>>>>>>>>> same files over and over in nifi itself you can make this happen or
>>>>>>>>> 
>>>>>>>>> cannot?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 20, 2021 at 11:08 AM Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> "After fetching a FlowFile-stream file and unpacked it back into NiFi
>>>>>>>>> 
>>>>>>>>> I calculate a sha256. 1 minutes later I recalculate the sha256 on the
>>>>>>>>> 
>>>>>>>>> exact same file. And got a new hash. That is what worry’s me.
>>>>>>>>> 
>>>>>>>>> The fact that the same file can be recalculated and produce two
>>>>>>>>> 
>>>>>>>>> different hashes, is very strange, but it happens. "
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Ok so to confirm you are saying that in each case this happens you see
>>>>>>>>> 
>>>>>>>>> it first compute the wrong hash, but then if you retry the same
>>>>>>>>> 
>>>>>>>>> flowfile it then provides the correct hash?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Can you please also show/share the lineage history for such a flow
>>>>>>>>> 
>>>>>>>>> file then?  It should have events for the initial hash, second hash,
>>>>>>>>> 
>>>>>>>>> the unpacking, trace to the original stream, etc...
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 20, 2021 at 11:00 AM Jens M. Kofoed 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Dear Mark and Joe
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I know my setup isn't normal for many people. But if we look only at 
>>>>>>>>> my receive side, which the last mails are about, everything is 
>>>>>>>>> happening on the same NiFi instance. It is the same 3-node NiFi 
>>>>>>>>> cluster.
>>>>>>>>> 
>>>>>>>>> After fetching a FlowFile-stream file and unpacking it back into 
>>>>>>>>> NiFi, I calculate a SHA-256. One minute later I recalculate the 
>>>>>>>>> SHA-256 on the exact same file and get a new hash. That is what 
>>>>>>>>> worries me.
>>>>>>>>> 
>>>>>>>>> The fact that the same file can be recalculated and produce two 
>>>>>>>>> different hashes is very strange, but it happens. Over the last 5 
>>>>>>>>> months it has happened only 35-40 times.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I could understand it if the file were not completely loaded and 
>>>>>>>>> saved into the content repository before the hashing starts. But I 
>>>>>>>>> believe the unpack process doesn't forward the flowfile to the next 
>>>>>>>>> process before it is 100% finished unpacking and saving the new 
>>>>>>>>> content to the repository.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I have a test flow where a GenerateFlowFile has created 6x 1GB files 
>>>>>>>>> (2 files per node), and the next process was a HashContent before it 
>>>>>>>>> ran into a test loop, where files are uploaded via PutSFTP to a test 
>>>>>>>>> server, downloaded again, and the hash recalculated. I have had one 
>>>>>>>>> issue after 3 days of running.
>>>>>>>>> 
>>>>>>>>> Now the test flow is running without the Put/Fetch sftp processors.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Another problem is that I can't find any correlation to other 
>>>>>>>>> events, neither within NiFi, nor on the server itself, nor in 
>>>>>>>>> VMware. If I could just find any other event happening at the same 
>>>>>>>>> time, I might be able to force some kind of event to trigger the 
>>>>>>>>> issue.
>>>>>>>>> 
>>>>>>>>> I have tried forcing VMware to migrate a NiFi node to another host, 
>>>>>>>>> and forcing it to take and delete snapshots, but nothing triggers an 
>>>>>>>>> error.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I know it will be very, very difficult to reproduce. But I will set 
>>>>>>>>> up multiple NiFi instances running different test flows to see if I 
>>>>>>>>> can find any reason why it behaves as it does.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Kind Regards
>>>>>>>>> 
>>>>>>>>> Jens M. Kofoed
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Den 20. okt. 2021 kl. 16.39 skrev Mark Payne <[email protected]>:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks for sharing the images.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I tried to set up a test to reproduce the issue. I've had it running 
>>>>>>>>> for quite some time, through millions of iterations.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I’ve used 5 KB files, 50 KB files, 50 MB files, and larger (to the 
>>>>>>>>> tune of hundreds of MB). I’ve been unable to reproduce an issue after 
>>>>>>>>> millions of iterations.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> So far I cannot replicate it. And since you're pulling the data via 
>>>>>>>>> SFTP and then unpacking, which preserves all original attributes 
>>>>>>>>> from a different system, this can easily become confusing.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I recommend trying to reproduce this with the SFTP-related 
>>>>>>>>> processors out of the picture, as Joe mentioned, using either 
>>>>>>>>> GetFile/FetchFile or GenerateFlowFile. Then immediately use 
>>>>>>>>> CryptographicHashContent to generate an 'initial hash', copy that 
>>>>>>>>> value to another attribute, and then loop, generating the hash and 
>>>>>>>>> comparing it against the original one. I'll attach a flow that does 
>>>>>>>>> this, though I'm not sure whether the mail server will strip out the 
>>>>>>>>> attachment.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This way we remove any possibility of actual corruption between the 
>>>>>>>>> two NiFi instances. If we can still see corruption / differing 
>>>>>>>>> hashes within a single NiFi instance, then it certainly warrants 
>>>>>>>>> further investigation, but I can't see any issues so far.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> -Mark
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Oct 20, 2021, at 10:21 AM, Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Actually, is this current loop test contained within a single NiFi, 
>>>>>>>>> and is that where you see the corruption happen?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Joe
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 20, 2021 at 7:14 AM Joe Witt <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> You have a very involved setup including other (non-NiFi) systems. 
>>>>>>>>> Have you removed those systems from the equation, so you have more 
>>>>>>>>> evidence to support your expectation that NiFi is doing something 
>>>>>>>>> other than you expect?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Joe
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 20, 2021 at 7:10 AM Jens M. Kofoed 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Today I have another file which has been through the retry loop one 
>>>>>>>>> time. To test the processors and the algorithm, I added the 
>>>>>>>>> HashContent processor and also added hashing by SHA-1.
>>>>>>>>> 
>>>>>>>>> A file went through the system, and both the SHA-1 and the SHA-256 
>>>>>>>>> were different than expected. After a 1-minute delay the file went 
>>>>>>>>> back into the hashing flow, and this time it calculated both hashes 
>>>>>>>>> correctly.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I don't believe the hashing is buggy, but something is very, very 
>>>>>>>>> strange. What can influence the processors/algorithm to calculate a 
>>>>>>>>> different hash?
>>>>>>>>> 
>>>>>>>>> All the input/output claim information is exactly the same. It is 
>>>>>>>>> the same flowfile/content going in a loop. It happens on all 3 nodes.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Any suggestions for where to dig ?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Regards
>>>>>>>>> 
>>>>>>>>> Jens M. Kofoed
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Den ons. 20. okt. 2021 kl. 06.34 skrev Jens M. Kofoed 
>>>>>>>>> <[email protected]>:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Mark
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks for replying and for the suggestion to look at the content claim.
>>>>>>>>> 
>>>>>>>>> These 3 pictures is from the first attempt:
>>>>>>>>> 
>>>>>>>>> <image.png>   <image.png>   <image.png>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Yesterday I realized that the content was still in the archive, so I 
>>>>>>>>> could Replay the file.
>>>>>>>>> 
>>>>>>>>> <image.png>
>>>>>>>>> 
>>>>>>>>> So here are the same pictures, but for the replay, and as you can 
>>>>>>>>> see the Identifier, Offset and Size are all the same.
>>>>>>>>> 
>>>>>>>>> <image.png>   <image.png>   <image.png>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> In my flow, if the hash does not match the originally calculated 
>>>>>>>>> hash, the file goes into a retry loop. Here are the pictures for the 
>>>>>>>>> 4th time the file went through:
>>>>>>>>> 
>>>>>>>>> <image.png>   <image.png>   <image.png>
>>>>>>>>> 
>>>>>>>>> Here the content Claim is all the same.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> It is very rare that we see these issues, fewer than 1 in 1,000,000 
>>>>>>>>> files, and only with large files. Only once have I seen the error 
>>>>>>>>> with a 110MB file; the other times the file sizes were above 800MB.
>>>>>>>>> 
>>>>>>>>> This time it was a NiFi FlowFile Stream v3 file, which was exported 
>>>>>>>>> from one system and imported into another. But once imported, it is 
>>>>>>>>> the same file inside NiFi, and it stays on the same node, going 
>>>>>>>>> through the same loop of processors multiple times. In the end the 
>>>>>>>>> CryptographicHashContent calculates a different SHA-256 than it did 
>>>>>>>>> earlier. This should not be possible!!! And that is what concerns me 
>>>>>>>>> the most.
>>>>>>>>> 
>>>>>>>>> What can influence the same processor to calculate 2 different 
>>>>>>>>> SHA-256 values on the exact same content?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Regards
>>>>>>>>> 
>>>>>>>>> Jens M. Kofoed
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Den tir. 19. okt. 2021 kl. 16.51 skrev Mark Payne 
>>>>>>>>> <[email protected]>:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Jens,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> In the two provenance events - one showing a hash of dd4cc… and the 
>>>>>>>>> other showing f6f0….
>>>>>>>>> 
>>>>>>>>> If you go to the Content tab, do they both show the same Content 
>>>>>>>>> Claim? I.e., do the Input Claim / Output Claim show the same values 
>>>>>>>>> for Container, Section, Identifier, Offset, and Size?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> -Mark
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Oct 19, 2021, at 1:22 AM, Jens M. Kofoed <[email protected]> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Dear NIFI Users
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I have posted this mail on the developers mailing list and just want 
>>>>>>>>> to inform all of you about a very odd behavior we are facing.
>>>>>>>>> 
>>>>>>>>> The background:
>>>>>>>>> 
>>>>>>>>> We have data going between 2 different NiFi systems which have no 
>>>>>>>>> direct network access to each other. Therefore we calculate a 
>>>>>>>>> SHA-256 hash of the content on system 1 before the flowfile and data 
>>>>>>>>> are combined and saved as a "flowfile-stream-v3" package file. The 
>>>>>>>>> file is then transported to system 2, where the package file is 
>>>>>>>>> unpacked and the flow can continue. To be sure about file integrity, 
>>>>>>>>> we calculate a new SHA-256 on system 2. But sometimes we see that 
>>>>>>>>> the SHA-256 gets another value, which might suggest the file was 
>>>>>>>>> corrupted. Yet recalculating the SHA-256 again gives a new hash value.
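The integrity scheme described above reduces to "hash before packaging, carry the digest with the flowfile, rehash after unpacking." A hedged Python sketch of that comparison (the attribute name `content.sha256` is illustrative, not necessarily what the real flow uses):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()

# System 1: hash the content, then ship digest + content together
# inside the flowfile-stream-v3 package.
content = b"example payload"  # stands in for the real file content
attributes = {"content.sha256": sha256_hex(content)}

# System 2: after unpacking, recompute and compare against the
# digest that travelled with the flowfile.
received = content  # bytes read back out of the content repository
matches = sha256_hex(received) == attributes["content.sha256"]
```

If `matches` is ever False while the bytes truly are identical on both sides, something between the content repository and the hashing processor returned different bytes, since the digest function itself cannot vary.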
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> ----
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Tonight I had yet another file which didn't match the expected 
>>>>>>>>> SHA-256 hash. The content is a 1.7GB file, and the Event Duration to 
>>>>>>>>> calculate the hash was "00:00:17.539".
>>>>>>>>> 
>>>>>>>>> I have created a retry loop, where the file goes to a Wait processor 
>>>>>>>>> that delays it 1 minute before sending it back to the 
>>>>>>>>> CryptographicHashContent for a new calculation. After 3 retries the 
>>>>>>>>> file goes to retries_exceeded and into a disabled processor, just so 
>>>>>>>>> it sits in a queue where I can look at it manually. This morning I 
>>>>>>>>> rerouted the file from my retries_exceeded queue back to the 
>>>>>>>>> CryptographicHashContent for a new calculation, and this time it 
>>>>>>>>> calculated the correct hash value.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> THIS CAN'T BE TRUE :-( :-( But it is. Something very, very strange 
>>>>>>>>> is happening.
>>>>>>>>> 
>>>>>>>>> <image.png>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> We are running NiFi 1.13.2 in a 3-node cluster on Ubuntu 20.04.2 
>>>>>>>>> with openjdk version "1.8.0_292", OpenJDK Runtime Environment (build 
>>>>>>>>> 1.8.0_292-8u292-b10-0ubuntu1~20.04-b10), OpenJDK 64-Bit Server VM 
>>>>>>>>> (build 25.292-b10, mixed mode). Each server is a VM with 4 CPUs and 
>>>>>>>>> 8GB RAM on VMware ESXi 7.0.2. Each NiFi node runs on a different 
>>>>>>>>> physical host.
>>>>>>>>> 
>>>>>>>>> I have inspected various logs to see if I can find any correlation 
>>>>>>>>> with what happened at the same time as the file was going through my 
>>>>>>>>> loop, but there are no events/tasks at that exact time.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> System 1:
>>>>>>>>> 
>>>>>>>>> At 10/19/2021 00:15:11.247 CEST my file is going through a 
>>>>>>>>> CryptographicHashContent: SHA256 value: 
>>>>>>>>> dd4cc7ef8dbc8d70528e8aa788581f0ab88d297c9c9f39b6b542df68952efd20
>>>>>>>>> 
>>>>>>>>> The file is exported as a "FlowFile Stream, v3" to System 2
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> SYSTEM 2:
>>>>>>>>> 
>>>>>>>>> At 10/19/2021 00:18:10.528 CEST the file is going through a 
>>>>>>>>> CryptographicHashContent: SHA256 value: 
>>>>>>>>> f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
>>>>>>>>> 
>>>>>>>>> <image.png>
>>>>>>>>> 
>>>>>>>>> At 10/19/2021 00:19:08.996 CEST the file is going through the same 
>>>>>>>>> CryptographicHashContent at system 2: SHA256 value: 
>>>>>>>>> f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
>>>>>>>>> 
>>>>>>>>> At 10/19/2021 00:20:04.376 CEST the file is going through the same 
>>>>>>>>> CryptographicHashContent at system 2: SHA256 value: 
>>>>>>>>> f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
>>>>>>>>> 
>>>>>>>>> At 10/19/2021 00:21:01.711 CEST the file is going through the same 
>>>>>>>>> CryptographicHashContent at system 2: SHA256 value: 
>>>>>>>>> f6f0909aacae4952f10f6fa7704f3e55d0481ec211d495993550aedbb3fe0819
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> At 10/19/2021 06:07:43.376 CEST the file is going through the same 
>>>>>>>>> CryptographicHashContent at system 2: SHA256 value: 
>>>>>>>>> dd4cc7ef8dbc8d70528e8aa788581f0ab88d297c9c9f39b6b542df68952efd20
>>>>>>>>> 
>>>>>>>>> <image.png>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> How on earth can this happen???
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Kind Regards
>>>>>>>>> 
>>>>>>>>> Jens M. Kofoed
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> <Repro.json>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> <Try_to_recreate_Jens_Challenge.json>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 