[ 
https://issues.apache.org/jira/browse/MINIFICPP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211184#comment-16211184
 ] 

ASF GitHub Bot commented on MINIFICPP-39:
-----------------------------------------

Github user achristianson commented on a diff in the pull request:

    https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145731149
  
    --- Diff: libminifi/include/core/ProcessSession.h ---
    @@ -151,11 +152,47 @@ class ProcessSession {
       bool keepSource,
                   uint64_t offset, char inputDelimiter);
     
    +  /**
    +   * Exports the data stream to a file
    +   * @param string file to export stream to
    +   * @param flow flow file
    +   * @param bool whether or not to keep the content in the flow file
    +   */
    +  bool exportContent(const std::string &destination,
    --- End diff --
    
    Export is simply meant to be the inverse of import.
    
    Digging into the code, export is used in this change set in 
UnfocusArchiveEntry in order to export the flow file content into a 
scratch/working location so that it can be re-assembled back into an archive.
    
    We're ultimately calling archive_write_data (line 286 of 
UnfocusArchiveEntry.cpp), which takes data from a byte buffer. Therefore, we 
don't necessarily require a persistent filesystem for this change, as this 
scratch/working area could be in RAM or some other medium as long as there's 
enough space.
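
    To illustrate the point about byte buffers, here is a minimal sketch (not 
the code in this pull request) of an export that pushes content to a generic 
byte sink. The names Bytes, ByteSink and exportTo are hypothetical and exist 
only for illustration; archive_write_data itself is not called here. The idea 
is that a consumer which only needs a byte buffer does not care whether the 
scratch area is a file or RAM.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for flow file content; not the MiNiFi type.
using Bytes = std::vector<char>;

// Hypothetical sink: receives one chunk of bytes per call, the same shape of
// input that a byte-buffer consumer would accept.
using ByteSink = std::function<void(const char *data, std::size_t size)>;

// Streams the content to the sink in chunks of at most chunkSize bytes.
// The sink decides where the bytes go (a file, RAM, or some other medium).
void exportTo(const Bytes &content, const ByteSink &sink,
              std::size_t chunkSize = 4096) {
  if (chunkSize == 0) {
    chunkSize = content.size();  // treat 0 as "one chunk for everything"
  }
  for (std::size_t off = 0; off < content.size(); off += chunkSize) {
    const std::size_t n = std::min(chunkSize, content.size() - off);
    sink(content.data() + off, n);
  }
}
```

    A RAM-backed sink could be as small as a lambda that appends each chunk to 
a std::string; a file-backed sink would write the same chunks to a 
std::ofstream instead. The caller picks the medium, which matches the intent 
that environmental requirements stay with the caller.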
    
    As it stands, these new archive processors do depend on persistent 
filesystem storage, but this addition of exportContent does not result in 
ProcessSession or any core component depending on any storage implementation 
where it previously did not. We're simply adding a mirror capability to import 
that is optional to use, and the caller is responsible for environmental 
considerations/requirements such as persistent storage.
    
    Assuming we move these new archive processors into an extension, a current 
prerequisite of using that extension will be persistent filesystem storage. 
This should be documented. This leaves open the door to future 
implementations/improvements which use RAM or some other medium to reconstitute 
the archive, but I think that level of functionality is not required 
immediately.


> Create FocusArchive processor
> -----------------------------
>
>                 Key: MINIFICPP-39
>                 URL: https://issues.apache.org/jira/browse/MINIFICPP-39
>             Project: NiFi MiNiFi C++
>          Issue Type: Task
>            Reporter: Andrew Christianson
>            Assignee: Andrew Christianson
>            Priority: Minor
>
> Create a FocusArchive processor which implements a lens over an archive 
> (tar, etc.). A concise, though informal, definition of a lens is as follows:
> "Essentially, they represent the act of “peering into” or “focusing in on” 
> some particular piece/path of a complex data object such that you can more 
> precisely target particular operations without losing the context or 
> structure of the overall data you’re working with." 
> https://medium.com/@dtipson/functional-lenses-d1aba9e52254#.hdgsvbraq
> Why a FocusArchive in MiNiFi? Simply put, it will enable us to "focus in on" 
> an entry in the archive, perform processing *in-context* of that entry, then 
> re-focus on the overall archive. This allows for transformation or other 
> processing of an entry in the archive without losing the overall context of 
> the archive.
> Initial format support is tar, due to its simplicity and ubiquity.
> Attributes:
> - Path (the path in the archive to focus; "/" to re-focus the overall archive)
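
The focus / re-focus idea in the issue description can be sketched roughly as 
follows. This is an illustration only, under the assumption that an archive 
can be modelled as an in-memory map from entry path to entry content; Archive, 
Focused, focus and unfocus are hypothetical names and are not the processor 
API.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for an archive: entry path -> content.
using Archive = std::map<std::string, std::string>;

// Result of focusing: the entry in focus plus the archive it came from,
// kept so the overall context is not lost while the entry is processed.
struct Focused {
  Archive context;      // the whole archive
  std::string path;     // path of the entry in focus
  std::string content;  // content of the entry in focus
};

// Focus on one entry of the archive. Throws if the path does not exist.
Focused focus(const Archive &archive, const std::string &path) {
  auto it = archive.find(path);
  if (it == archive.end()) {
    throw std::out_of_range("no such entry: " + path);
  }
  return Focused{archive, path, it->second};
}

// Re-focus on the overall archive, writing the (possibly modified) entry
// content back into its original place.
Archive unfocus(Focused f) {
  f.context[f.path] = std::move(f.content);
  return std::move(f.context);
}
```

With this sketch, processing an entry in context amounts to calling focus, 
modifying the content field, and calling unfocus; the other entries pass 
through untouched.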



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
