Thanks a lot for confirming my suspicions. One last clarification: the WAL (write-ahead log) is different from the swapping concept, correct? I guess it's much faster to swap in a dedicated "dump" than to replay a WAL.

On Wed, Feb 17, 2016 at 7:53 PM, Joe Witt <joe.w...@gmail.com> wrote:

> Lars,
>
> You are right about the thought process. We've never provided solid
> guidance here but we should. It is definitely the case that flow file
> content is streamed to and from the underlying repository and the only
> way to access it is through that API. Thus well behaved extensions
> and the framework itself can handle basically data as large as the
> underlying repository has space for. For the flow file attributes
> though these are held in memory in a map with each flowfile object.
> So it is important to avoid having vast (undefined) quantities of
> attributes or attributes with really large (undefined) values.
>
> There are things we can and should do to make even this relatively
> transparent to the users and it is why actually we support swapping
> flowfiles to disk when there are large queues because even those inmem
> attributes can really add up.
>
> Thanks
> Joe
>
> On Wed, Feb 17, 2016 at 11:06 AM, Lars Francke <lars.fran...@gmail.com> wrote:
> > Hi and sorry for all these questions.
> >
> > I know that FlowFile content is persisted to the content_repository and can
> > handle reasonably large amounts of data. Is the same true for attributes?
> >
> > I download JSON files (up to 200kb I'd say) and I want to insert them as
> > they are into a PostgreSQL JSONB column. I'd love to use the PutSQL
> > processor for that but it requires parameters in attributes.
> >
> > I have a feeling that putting large objects in attributes is a bad idea?
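
For a sense of scale on how in-memory attributes "add up", here is a rough back-of-the-envelope sketch in Python. The 200 kB figure comes from the question above. The queue depth of 10,000 is purely an assumption for illustration, not a NiFi default, and the estimate ignores JVM object overhead as well as any swapping of queued flowfiles to disk.

```python
# Rough estimate only: the queue depth is a made-up figure for illustration.
ATTRIBUTE_BYTES = 200 * 1000   # one ~200 kB JSON document stored as an attribute
QUEUED_FLOWFILES = 10_000      # hypothetical queue depth, not a NiFi default


def attribute_heap_bytes(flowfiles: int, bytes_per_flowfile: int) -> int:
    """Approximate bytes of attribute data kept in memory for a queue.

    Assumes every queued flowfile holds its attribute map in memory and
    that nothing has been swapped out to disk.
    """
    return flowfiles * bytes_per_flowfile


total = attribute_heap_bytes(QUEUED_FLOWFILES, ATTRIBUTE_BYTES)
print(f"{total / 1e9:.1f} GB of attribute data held in memory")
```

Under these assumptions the attributes alone come to about 2 GB, whereas the same JSON stored as flowfile content would be streamed from the repository rather than held in memory.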