Lars,

You are right about the thought process. We've never provided solid guidance here, but we should.

It is definitely the case that flow file content is streamed to and from the underlying repository, and the only way to access it is through that API. Thus well-behaved extensions, and the framework itself, can handle data essentially as large as the underlying repository has space for. The flow file attributes, though, are held in memory in a map on each FlowFile object, so it is important to avoid vast (undefined) quantities of attributes, or attributes with really large (undefined) values.
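To make that concrete, here is a rough sketch of the two access patterns (the class and method names are made up for illustration; session.read, InputStreamCallback, and getAttributes are the standard nifi-api calls):

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Map;

    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.io.InputStreamCallback;

    // Hypothetical example class, not a real processor.
    public class ContentVsAttributesExample {

        void example(final ProcessSession session, final FlowFile flowFile) {
            // Content: streamed from the content repository through a
            // callback, so it can be as large as the repository has space for.
            session.read(flowFile, new InputStreamCallback() {
                @Override
                public void process(final InputStream in) throws IOException {
                    final byte[] buffer = new byte[8192];
                    while (in.read(buffer) != -1) {
                        // process incrementally; never buffer the whole stream
                    }
                }
            });

            // Attributes: a plain in-memory map on each FlowFile object, so
            // many attributes, or very large values, cost heap directly.
            final Map<String, String> attributes = flowFile.getAttributes();
        }
    }

The content callback can process data far larger than the heap because it is never fully resident in memory at once; the attribute map has no such escape hatch.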
There are things we can and should do to make even this relatively transparent to users, and it is actually why we support swapping flow files to disk when queues get large: even those in-memory attributes can really add up.

Thanks
Joe

On Wed, Feb 17, 2016 at 11:06 AM, Lars Francke <[email protected]> wrote:
> Hi and sorry for all these questions.
>
> I know that FlowFile content is persisted to the content_repository and can
> handle reasonably large amounts of data. Is the same true for attributes?
>
> I download JSON files (up to 200 kB, I'd say) and I want to insert them as
> they are into a PostgreSQL JSONB column. I'd love to use the PutSQL
> processor for that, but it requires parameters in attributes.
>
> I have a feeling that putting large objects in attributes is a bad idea?
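For reference, the queue swapping mentioned above is controlled through nifi.properties; a rough sketch of the relevant entries (the values shown are what I believe the defaults to be, so check the nifi.properties shipped with your install):

    # Queues longer than this many flow files start swapping to disk.
    nifi.queue.swap.threshold=20000
    nifi.swap.in.period=5 sec
    nifi.swap.in.threads=1
    nifi.swap.out.period=5 sec
    nifi.swap.out.threads=4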
