Manually deleting the old files could certainly cause problems, since
FlowFiles still queued in the flow may reference that content.

The odd thing is that the UI shows 10,000+ queued FlowFiles totaling 0 bytes,
so each of them is empty.
Is that normal for your flow?

Could you try running the following against your content repo:

find . -size +1M

find . | wc -l

Curious how many files there are and how many are "large" files.
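
If it helps, here is a minimal sketch that wraps those two checks into one
function. It assumes a find that supports the M size suffix (GNU find does),
and it restricts both counts to regular files with -type f, since a plain
"find . | wc -l" also counts directories. The function name repo_stats is
purely illustrative, not anything NiFi provides.

```shell
#!/bin/sh
# Summarize a directory tree: how many regular files it holds and how many
# of those are larger than 1 MiB. repo_stats is a hypothetical helper name.
repo_stats() {
    repo_dir="${1:-.}"
    # Count regular files only (directories are excluded by -type f).
    total=$(find "$repo_dir" -type f | wc -l | tr -d '[:space:]')
    # -size +1M matches files whose size, rounded up to 1 MiB units, exceeds 1.
    large=$(find "$repo_dir" -type f -size +1M | wc -l | tr -d '[:space:]')
    echo "total_files=$total large_files=$large"
}
```

Run it from the content repo as "repo_stats ." or pass the directory as the
first argument.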



> On Jun 15, 2016, at 5:02 PM, Ricky Saltzer <[email protected]> wrote:
> 
> Is it safe to manually remove some of the older files in the repository to
> avoid our disk from filling up?
> 
> On Wed, Jun 15, 2016 at 4:55 PM, Ricky Saltzer <[email protected]> wrote:
> 
>> Just a reminder, I just today noticed the "archive.enabled" option was
>> false and changed it to true.
>> 
>> $ find . -type f -ls | grep archive | wc -l
>> 0
>> 
>> 
>> 
>> On Wed, Jun 15, 2016 at 4:53 PM, Mark Payne <[email protected]> wrote:
>> 
>>> OK, thanks. It doesn't appear that it believes there is anything to
>>> reclaim.
>>> 
>>> Can you try going to your content repository and running:
>>> 
>>> find . -type f -ls | grep archive
>>> 
>>> Curious as to how much data it has archived.
>>> 
>>>> On Jun 15, 2016, at 4:48 PM, Ricky Saltzer <[email protected]> wrote:
>>>> 
>>>> Oh sorry! Trying again
>>>> 
>>>> [1]
>>>> https://gist.githubusercontent.com/rickysaltzer/b00196a3881c052df9b38b418722cd02/raw/279a1bc8c60530426732eb7b653de1f3f74574e2/gistfile1.txt
>>>> 
>>>> 
>>>> On Wed, Jun 15, 2016 at 4:38 PM, Ricky Saltzer <[email protected]> wrote:
>>>> 
>>>>> I should also mention, I just realized that our worker nodes are on
>>>>> 0.5.1, and for some reason I missed updating the master from 0.4.0.
>>>>> I'm sure that is not helping.
>>>>> 
>>>>> On Wed, Jun 15, 2016 at 4:36 PM, Ricky Saltzer <[email protected]> wrote:
>>>>> 
>>>>>> Looks like the threads are parked and waiting [1]
>>>>>> 
>>>>>> [1]
>>>>>> http://github.mtv.cloudera.com/gist/ricky/7a5d89f2eeba58e2206d/raw/0e2b446ca049a8b5f27298c700ac709772d2847c/gistfile1.txt
>>>>>> 
>>>>>> On Wed, Jun 15, 2016 at 4:33 PM, Joe Witt <[email protected]> wrote:
>>>>>> 
>>>>>>> Thanks, Ricky. Then please take a look at Mark's note, as that is
>>>>>>> probably more relevant to your case.
>>>>>>> 
>>>>>>> On Wed, Jun 15, 2016 at 4:32 PM, Ricky Saltzer <[email protected]> wrote:
>>>>>>>> Hey Joe -
>>>>>>>> 
>>>>>>>> The NiFi web UI currently reads as:
>>>>>>>> 
>>>>>>>> Active threads: 3
>>>>>>>> Queued: 10,173 / 0 bytes
>>>>>>>> Connected nodes: 2 / 2
>>>>>>>> Stats last refreshed: 13:31:28 PDT
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Jun 15, 2016 at 4:29 PM, Joe Witt <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> And the data remains? If so, that is an interesting data point, I
>>>>>>>>> think. So, to Mark's point, how much data do you have actively
>>>>>>>>> queued up in the flow on that node? The number of objects you
>>>>>>>>> mention is 3273 files, corresponding to 825GB in the content
>>>>>>>>> repository. Does NiFi see those 825GB worth of data as being in the
>>>>>>>>> flow/queued up? If so, are we talking about a roughly 1TB repo, so
>>>>>>>>> that the reported value seems correct and this is simply a case of
>>>>>>>>> queueing near the limit your system can hold?
>>>>>>>>> 
>>>>>>>>> On Wed, Jun 15, 2016 at 4:24 PM, Ricky Saltzer <[email protected]> wrote:
>>>>>>>>>> I have two nodes in clustered mode. The node that isn't filling
>>>>>>>>>> up is my primary. I've actually already restarted NiFi a few times
>>>>>>>>>> on the node which has the large repository.
>>>>>>>>>> 
>>>>>>>>>> On Wed, Jun 15, 2016 at 4:22 PM, Joe Witt <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Ricky,
>>>>>>>>>>> 
>>>>>>>>>>> If you restart NiFi and then find that it cleans those things up,
>>>>>>>>>>> I believe it is related to the defects corrected in the 0.5/0.6
>>>>>>>>>>> timeframe.
>>>>>>>>>>> 
>>>>>>>>>>> Is restarting an option for you at this time? Do you agree, Mark?
>>>>>>>>>>> 
>>>>>>>>>>> Thanks
>>>>>>>>>>> Joe
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Jun 15, 2016 at 4:21 PM, Ricky Saltzer <[email protected]> wrote:
>>>>>>>>>>>> Hey Mark -
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks for the quick reply! This is our production system, so it's
>>>>>>>>>>>> unfortunately running 0.4.0. There are currently 3273 files, with
>>>>>>>>>>>> some files dating back to May 18th. The content repository itself
>>>>>>>>>>>> is 825G.
>>>>>>>>>>>> 
>>>>>>>>>>>> Ricky
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Jun 15, 2016 at 4:17 PM, Mark Payne <[email protected]> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hey Ricky
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The reclaim process is pretty much continuous. What version of
>>>>>>>>>>>>> NiFi are you running? I know there was an issue with this a while
>>>>>>>>>>>>> back that caused it not to clean up properly.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Also, how much data and how many FlowFiles do you have queued up
>>>>>>>>>>>>> in your flow? Data won't be archived or reclaimed while it is
>>>>>>>>>>>>> still in the flow.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> -Mark
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jun 15, 2016, at 4:04 PM, Ricky Saltzer <[email protected]> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hey guys -
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I recently discovered I didn't have my "archive.enabled" option
>>>>>>>>>>>>>> set to true after my disk filled up to 95%. I enabled it and then
>>>>>>>>>>>>>> set the retention period to 12 hours and 50% (default values).
>>>>>>>>>>>>>> However, after restarting NiFi, I am not seeing any disk space
>>>>>>>>>>>>>> reclaimed.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'm curious, is the reclaiming process periodic or continuous?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>> ricky
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Ricky Saltzer
>>>>>>>>>>>> http://www.cloudera.com
