[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi-minifi-cpp/pull/148


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-23 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r146259050
  
--- Diff: libminifi/include/core/ProcessSessionReadCallback.h ---
@@ -0,0 +1,33 @@
+#ifndef __PROCESS_SESSION_READ_CALLBACK_H__
--- End diff --

Since you broke this out into a different file (thanks!), it will need a 
license header. 


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-21 Thread minifirocks
Github user minifirocks commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r146120896
  
--- Diff: libminifi/include/core/ProcessSession.h ---
@@ -151,11 +152,47 @@ class ProcessSession {
   bool keepSource,
   uint64_t offset, char inputDelimiter);
 
+  /**
+   * Exports the data stream to a file
+   * @param string file to export stream to
+   * @param flow flow file
+   * @param bool whether or not to keep the content in the flow file
+   */
+  bool exportContent(const std::string ,
--- End diff --

The ArchiveMerge class in MergeContent.h (under the "// Archive Class" 
comment) does not rely on persistent storage. It uses 
archive_write_open(arch, this, NULL, archive_write, NULL); to write the 
processed content held in RAM into the flow file.
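
The callback approach described above can be sketched without libarchive 
itself. The block below is only an illustration of the pattern: 
WriteCallback, MemorySink, appendToSink, and emitInChunks are hypothetical 
names, and the callback signature is a simplified stand-in, not libarchive's 
actual typedef.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for a callback-style writer: the producer hands each
// chunk to a function pointer together with an opaque client pointer.
using WriteCallback = long (*)(void *clientData, const void *buffer,
                               std::size_t length);

// Hypothetical in-memory sink that collects every chunk it receives.
struct MemorySink {
  std::vector<std::uint8_t> bytes;
};

// Callback that appends one chunk to the MemorySink behind clientData.
long appendToSink(void *clientData, const void *buffer, std::size_t length) {
  auto *sink = static_cast<MemorySink *>(clientData);
  const auto *begin = static_cast<const std::uint8_t *>(buffer);
  sink->bytes.insert(sink->bytes.end(), begin, begin + length);
  return static_cast<long>(length);
}

// Hypothetical producer: feeds data to the callback in fixed-size chunks.
// Returns the total number of bytes accepted, or -1 on failure.
long emitInChunks(const std::string &data, std::size_t chunkSize,
                  WriteCallback cb, void *clientData) {
  if (chunkSize == 0) {
    return -1;
  }
  long total = 0;
  for (std::size_t offset = 0; offset < data.size(); offset += chunkSize) {
    const std::size_t n = std::min(chunkSize, data.size() - offset);
    const long written = cb(clientData, data.data() + offset, n);
    if (written < 0) {
      return -1;
    }
    total += written;
  }
  return total;
}
```

Because the sink is reached only through the opaque pointer, the producer 
never needs to know whether the bytes end up in RAM, in a flow file, or on 
disk.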


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread achristianson
Github user achristianson commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145748583
  
--- Diff: libminifi/include/FlowFileRecord.h ---
@@ -164,6 +164,11 @@ class FlowFileRecord : public core::FlowFile, public 
io::Serializable {
 return content_full_fath_;
   }
 
+  /**
+   * Cleanly relinquish a resource claim
+   */
--- End diff --

The flow file is releasing its own claim on the resource. There are a few 
cases where the flow file holds a claim that it should give up, leaving the 
resource to any other components that also hold a claim on it (which might 
currently be a hypothetical-only case).
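
A minimal sketch of that idea, assuming a simple owner counter: DemoClaim, 
DemoFlowFile, and releaseClaim are hypothetical names for illustration and 
are not the classes in this pull request.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a content claim shared by several components.
class DemoClaim {
 public:
  explicit DemoClaim(std::string path) : path_(std::move(path)) {}
  void increaseOwnerCount() { ++owners_; }
  void decreaseOwnerCount() { if (owners_ > 0) --owners_; }
  int ownerCount() const { return owners_; }
  const std::string &path() const { return path_; }

 private:
  std::string path_;
  int owners_ = 0;
};

// Hypothetical flow file that holds at most one claim.
class DemoFlowFile {
 public:
  // Take a claim, releasing any claim held before.
  void setClaim(std::shared_ptr<DemoClaim> claim) {
    releaseClaim();
    claim_ = std::move(claim);
    if (claim_) claim_->increaseOwnerCount();
  }

  // Give up this flow file's claim without affecting other owners.
  void releaseClaim() {
    if (claim_) {
      claim_->decreaseOwnerCount();
      claim_.reset();
    }
  }

  bool hasClaim() const { return claim_ != nullptr; }

 private:
  std::shared_ptr<DemoClaim> claim_;
};
```

In this sketch the resource stays alive for the remaining owners, and only 
the releasing flow file loses access to it.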


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145747588
  
--- Diff: libminifi/include/core/FlowConfiguration.h ---
@@ -35,6 +35,8 @@
 #include "processors/ExecuteProcess.h"
 #include "processors/AppendHostInfo.h"
 #include "processors/MergeContent.h"
+#include "processors/FocusArchiveEntry.h"
--- End diff --

Clarified in the latest commit


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145747665
  
--- Diff: libminifi/include/FlowFileRecord.h ---
@@ -164,6 +164,11 @@ class FlowFileRecord : public core::FlowFile, public 
io::Serializable {
 return content_full_fath_;
   }
 
+  /**
+   * Cleanly relinquish a resource claim
+   */
--- End diff --

Clarified in the latest commit


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145743863
  
--- Diff: libminifi/src/processors/FocusArchiveEntry.cpp ---
@@ -0,0 +1,340 @@
+/**
+ * @file FocusArchiveEntry.cpp
+ * FocusArchiveEntry class implementation
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#include "processors/FocusArchiveEntry.h"
+
+#include 
+#include 
+
+#include 
+
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+
+#include "json/json.h"
+#include "json/writer.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+core::Property FocusArchiveEntry::Path(
+"Path",
+"The path within the archive to focus (\"/\" to focus the total 
archive)",
+"");
+core::Relationship FocusArchiveEntry::Success(
+"success",
+"success operational on the flow record");
+
+bool 
FocusArchiveEntry::set_del_or_update_attr(std::shared_ptr 
flowFile, const std::string key, std::string* value) const {
+  if (value == nullptr)
+return flowFile->removeAttribute(key);
+  else if (flowFile->updateAttribute(key, *value))
+return true;
+  else
+return flowFile->addAttribute(key, *value);
+}
+
+void FocusArchiveEntry::initialize() {
+  //! Set the supported properties
+  std::set properties;
+  properties.insert(Path);
+  setSupportedProperties(properties);
+  //! Set the supported relationships
+  std::set relationships;
+  relationships.insert(Success);
+  setSupportedRelationships(relationships);
+}
+
+void FocusArchiveEntry::onTrigger(core::ProcessContext *context,
+  core::ProcessSession *session) {
+  auto flowFile = session->get();
+  std::shared_ptr flowFileRecord = 
std::static_pointer_cast(flowFile);
+
+  if (!flowFile) {
+return;
+  }
+
+  std::string targetEntry;
+  context->getProperty(Path.getName(), targetEntry);
+
+  // Extract archive contents
+  ArchiveMetadata archiveMetadata;
+  archiveMetadata.focusedEntry = targetEntry;
+  ReadCallback cb();
+  session->read(flowFile, &cb);
+
+  // For each extracted entry, import & stash to key
+  std::string targetEntryStashKey;
+
+  for (auto &entryMetadata : archiveMetadata.entryMetadata) {
+if (entryMetadata.entryType == AE_IFREG) {
+  logger_->log_info("FocusArchiveEntry importing %s from %s",
+  entryMetadata.entryName.c_str(),
+  entryMetadata.tmpFileName.c_str());
+  session->import(entryMetadata.tmpFileName, flowFile, false, 0);
+  char stashKey[37];
+  uuid_t stashKeyUuid;
+  uuid_generate(stashKeyUuid);
+  uuid_unparse_lower(stashKeyUuid, stashKey);
+  logger_->log_debug(
+  "FocusArchiveEntry generated stash key %s for entry %s",
+  stashKey,
+  entryMetadata.entryName.c_str());
+  entryMetadata.stashKey.assign(stashKey);
+
+  if (entryMetadata.entryName == targetEntry) {
+targetEntryStashKey = entryMetadata.stashKey;
+  }
+
+  // Stash the content
+  session->stash(entryMetadata.stashKey, flowFile);
+}
+  }
+
+  // Restore target archive entry
+  if (targetEntryStashKey != "") {
+session->restore(targetEntryStashKey, flowFile);
+  } else {
+logger_->log_warn(
+  "FocusArchiveEntry failed to locate target entry: %s",
+  targetEntry.c_str());
+  }
+
+  // Set new/updated lens stack to attribute
+  {
+Json::Value lensStack;
+Json::Reader reader;
+
+std::string existingLensStack;
+
+if (flowFile->getAttribute("lens.archive.stack", existingLensStack)) {
+  


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145744191
  
--- Diff: libminifi/include/processors/FocusArchiveEntry.h ---
@@ -0,0 +1,115 @@
+/**
+ * @file FocusArchiveEntry.h
+ * FocusArchiveEntry class declaration
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#ifndef LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+#define LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+
+#include 
+#include 
+#include 
+
+#include "FlowFileRecord.h"
+#include "core/Processor.h"
+#include "core/ProcessSession.h"
+#include "core/Core.h"
+#include "core/logging/LoggerConfiguration.h"
+#include "core/Resource.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+using logging::LoggerFactory;
+
+//! FocusArchiveEntry Class
+class FocusArchiveEntry : public core::Processor {
+ public:
+  //! Constructor
+  /*!
+   * Create a new processor
+   */
+  explicit FocusArchiveEntry(std::string name, uuid_t uuid = NULL)
+  : core::Processor(name, uuid),
+logger_(logging::LoggerFactory::getLogger()) {
+  }
+  //! Destructor
+  virtual ~FocusArchiveEntry()   {
+  }
+  //! Processor Name
+  static constexpr char const* ProcessorName = "FocusArchiveEntry";
+  //! Supported Properties
+  static core::Property Path;
+  //! Supported Relationships
+  static core::Relationship Success;
+
+  bool set_del_or_update_attr(std::shared_ptr, const 
std::string, std::string*) const;
--- End diff --

Fixed in d2e7e34ab8b331ac484b9b16bd51455799a1502b


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread achristianson
Github user achristianson commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145743112
  
--- Diff: libminifi/src/core/ProcessSession.cpp ---
@@ -799,6 +799,152 @@ void ProcessSession::import(std::string source, 
std::shared_ptr
   }
 }
 
+bool ProcessSession::exportContent(
+const std::string &destination,
+const std::string &tmpFile,
+std::shared_ptr &flow,
+bool keepContent) {
+  logger_->log_info(
+  "Exporting content of %s to %s",
+  flow->getUUIDStr().c_str(),
+  destination.c_str());
+
+  ReadCallback cb(tmpFile, destination, logger_);
+  read(flow, &cb);
+
+  logger_->log_info("Committing %s", destination.c_str());
+  bool commit_ok = cb.commit();
+
+  if (commit_ok) {
+logger_->log_info("Commit OK.");
+  } else {
+logger_->log_error(
+  "Commit of %s to %s failed!",
+  flow->getUUIDStr().c_str(),
+  destination.c_str());
+  }
+  return commit_ok;
+}
+
+bool ProcessSession::exportContent(
+const std::string &destination,
+std::shared_ptr &flow,
+bool keepContent) {
+  std::string tmpFileName = boost::filesystem::unique_path().native();
+  return exportContent(destination, tmpFileName, flow, keepContent);
+}
+
+ProcessSession::ReadCallback::ReadCallback(const std::string &tmpFile,
+   const std::string &destFile,
+   std::shared_ptr logger)
+: _tmpFile(tmpFile),
+  _tmpFileOs(tmpFile, std::ios::binary),
+  _destFile(destFile),
+  logger_(logger) {
+}
+
+// Copy the entire file contents to the temporary file
+int64_t 
ProcessSession::ReadCallback::process(std::shared_ptr stream) {
+  // Copy file contents into tmp file
+  _writeSucceeded = false;
+  size_t size = 0;
+  uint8_t buffer[8192];
+  do {
+int read = stream->read(buffer, 8192);
+if (read < 0) {
+  return -1;
+}
+if (read == 0) {
+  break;
+}
+_tmpFileOs.write(reinterpret_cast(buffer), read);
+size += read;
+  } while (size < stream->getSize());
+  _writeSucceeded = true;
+  return size;
+}
+
+// Renames tmp file to final destination
+// Returns true if commit succeeded
+bool ProcessSession::ReadCallback::commit() {
+  bool success = false;
+
+  logger_->log_info("committing export operation to %s", 
_destFile.c_str());
+
+  if (_writeSucceeded) {
+_tmpFileOs.close();
+
+if (rename(_tmpFile.c_str(), _destFile.c_str())) {
+  logger_->log_info("commit export operation to %s failed because 
rename() call failed", _destFile.c_str());
+} else {
+  success = true;
+  logger_->log_info("commit export operation to %s succeeded", 
_destFile.c_str());
+}
+  } else {
+logger_->log_error("commit export operation to %s failed because write 
failed", _destFile.c_str());
+  }
+  return success;
+}
+
+// Clean up resources
+ProcessSession::ReadCallback::~ReadCallback() {
+  // Close tmp file
+  _tmpFileOs.close();
+
+  // Clean up tmp file, if necessary
+  unlink(_tmpFile.c_str());
+}
+
+
+void ProcessSession::stash(const std::string , 
std::shared_ptr flow) {
--- End diff --

@phrocker, by 'tmp file' are you referring to the stash claims?

We would want those to behave the same as the primary content claim, and it 
looks like this may not be the case currently. Would adding the stash claims to 
the data stored/retrieved in FlowFileRecord's Serialize/Deserialize cover all 
the bases, or would other changes be required as well?
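
To make the question concrete: if stash claims live in a map next to the 
primary claim, serialization has to write that map as well, or stashed 
content is lost on reload. A minimal sketch under that assumption follows. 
DemoRecord, serialize, deserialize, and the line-based format are 
hypothetical and are not FlowFileRecord's actual layout.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical record: one primary content claim plus stashed claims by key.
struct DemoRecord {
  std::string contentClaim;                        // "" when there is none
  std::map<std::string, std::string> stashClaims;  // stash key -> claim path

  // Move the primary claim under key, leaving the record without content.
  void stash(const std::string &key) {
    stashClaims[key] = contentClaim;
    contentClaim.clear();
  }

  // Move the claim stored under key back into the primary slot.
  bool restore(const std::string &key) {
    auto it = stashClaims.find(key);
    if (it == stashClaims.end()) return false;
    contentClaim = it->second;
    stashClaims.erase(it);
    return true;
  }
};

// Write the primary claim and every stash claim, one field per line.
std::string serialize(const DemoRecord &r) {
  std::ostringstream out;
  out << r.contentClaim << '\n' << r.stashClaims.size() << '\n';
  for (const auto &kv : r.stashClaims) {
    out << kv.first << '\n' << kv.second << '\n';
  }
  return out.str();
}

// Inverse of serialize(); assumes keys and paths contain no newlines.
DemoRecord deserialize(const std::string &data) {
  std::istringstream in(data);
  DemoRecord r;
  std::string line;
  std::getline(in, r.contentClaim);
  std::getline(in, line);
  const std::size_t count = std::stoul(line);
  for (std::size_t i = 0; i < count; ++i) {
    std::string key;
    std::string path;
    std::getline(in, key);
    std::getline(in, path);
    r.stashClaims[key] = path;
  }
  return r;
}
```

Whether the real change also needs reference counting or repository updates 
for the stashed claims is a separate question that this sketch does not 
answer.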


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145743845
  
--- Diff: LICENSE ---
@@ -534,4 +534,68 @@ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 
OUT OF OR IN
 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
+This projects includes libarchive bundle (https://www.libarchive.org)
+which is available under a BSD License by Tim Kientzle and others
+
--- End diff --

Fixed in d2e7e34ab8b331ac484b9b16bd51455799a1502b


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145744366
  
--- Diff: libminifi/include/core/ProcessSession.h ---
@@ -19,6 +19,7 @@
 #define __PROCESS_SESSION_H__
 
 #include 
+#include 
--- End diff --

Fixed in d2e7e34ab8b331ac484b9b16bd51455799a1502b


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145745217
  
--- Diff: libminifi/include/processors/FocusArchiveEntry.h ---
@@ -0,0 +1,115 @@
+/**
+ * @file FocusArchiveEntry.h
+ * FocusArchiveEntry class declaration
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#ifndef LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+#define LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+
+#include 
+#include 
+#include 
+
+#include "FlowFileRecord.h"
+#include "core/Processor.h"
+#include "core/ProcessSession.h"
+#include "core/Core.h"
+#include "core/logging/LoggerConfiguration.h"
+#include "core/Resource.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+using logging::LoggerFactory;
--- End diff --

Fixed in d2e7e34ab8b331ac484b9b16bd51455799a1502b


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread achristianson
Github user achristianson commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145734644
  
--- Diff: libminifi/include/processors/FocusArchiveEntry.h ---
@@ -0,0 +1,115 @@
+/**
+ * @file FocusArchiveEntry.h
+ * FocusArchiveEntry class declaration
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#ifndef LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+#define LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+
+#include 
+#include 
+#include 
+
+#include "FlowFileRecord.h"
+#include "core/Processor.h"
+#include "core/ProcessSession.h"
+#include "core/Core.h"
+#include "core/logging/LoggerConfiguration.h"
+#include "core/Resource.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+using logging::LoggerFactory;
--- End diff --

I would opt to just drop the using-declaration here. It is not worth breaking 
the convention for the one class.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145744343
  
--- Diff: libminifi/include/core/FlowFile.h ---
@@ -50,6 +50,32 @@ class FlowFile : public core::Connectable {
   void clearResourceClaim();
 
   /**
+   * Returns a pointer to this flow file record's
+   * claim at the given stash key
+   */
+  std::shared_ptr getStashClaim(const std::string );
+
+  /**
+   * Sets the given stash key to the inbound claim argument
+   */
+  void setStashClaim(const std::string , 
std::shared_ptr );
--- End diff --

Fixed in d2e7e34ab8b331ac484b9b16bd51455799a1502b


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145743879
  
--- Diff: libminifi/src/processors/FocusArchiveEntry.cpp ---
@@ -0,0 +1,340 @@
+/**
+ * @file FocusArchiveEntry.cpp
+ * FocusArchiveEntry class implementation
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#include "processors/FocusArchiveEntry.h"
+
+#include 
+#include 
+
+#include 
+
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+
+#include "json/json.h"
+#include "json/writer.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+core::Property FocusArchiveEntry::Path(
+"Path",
+"The path within the archive to focus (\"/\" to focus the total 
archive)",
+"");
+core::Relationship FocusArchiveEntry::Success(
+"success",
+"success operational on the flow record");
+
+bool 
FocusArchiveEntry::set_del_or_update_attr(std::shared_ptr 
flowFile, const std::string key, std::string* value) const {
+  if (value == nullptr)
+return flowFile->removeAttribute(key);
+  else if (flowFile->updateAttribute(key, *value))
+return true;
+  else
+return flowFile->addAttribute(key, *value);
+}
+
+void FocusArchiveEntry::initialize() {
+  //! Set the supported properties
+  std::set properties;
+  properties.insert(Path);
+  setSupportedProperties(properties);
+  //! Set the supported relationships
+  std::set relationships;
+  relationships.insert(Success);
+  setSupportedRelationships(relationships);
+}
+
+void FocusArchiveEntry::onTrigger(core::ProcessContext *context,
+  core::ProcessSession *session) {
+  auto flowFile = session->get();
+  std::shared_ptr flowFileRecord = 
std::static_pointer_cast(flowFile);
+
+  if (!flowFile) {
+return;
+  }
+
+  std::string targetEntry;
+  context->getProperty(Path.getName(), targetEntry);
+
+  // Extract archive contents
+  ArchiveMetadata archiveMetadata;
+  archiveMetadata.focusedEntry = targetEntry;
+  ReadCallback cb();
+  session->read(flowFile, &cb);
+
+  // For each extracted entry, import & stash to key
+  std::string targetEntryStashKey;
+
+  for (auto &entryMetadata : archiveMetadata.entryMetadata) {
+if (entryMetadata.entryType == AE_IFREG) {
+  logger_->log_info("FocusArchiveEntry importing %s from %s",
+  entryMetadata.entryName.c_str(),
+  entryMetadata.tmpFileName.c_str());
+  session->import(entryMetadata.tmpFileName, flowFile, false, 0);
+  char stashKey[37];
+  uuid_t stashKeyUuid;
+  uuid_generate(stashKeyUuid);
+  uuid_unparse_lower(stashKeyUuid, stashKey);
+  logger_->log_debug(
+  "FocusArchiveEntry generated stash key %s for entry %s",
+  stashKey,
+  entryMetadata.entryName.c_str());
+  entryMetadata.stashKey.assign(stashKey);
+
+  if (entryMetadata.entryName == targetEntry) {
+targetEntryStashKey = entryMetadata.stashKey;
+  }
+
+  // Stash the content
+  session->stash(entryMetadata.stashKey, flowFile);
+}
+  }
+
+  // Restore target archive entry
+  if (targetEntryStashKey != "") {
+session->restore(targetEntryStashKey, flowFile);
+  } else {
+logger_->log_warn(
+  "FocusArchiveEntry failed to locate target entry: %s",
+  targetEntry.c_str());
+  }
+
+  // Set new/updated lens stack to attribute
+  {
+Json::Value lensStack;
+Json::Reader reader;
+
+std::string existingLensStack;
+
+if (flowFile->getAttribute("lens.archive.stack", existingLensStack)) {
+  


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145743902
  
--- Diff: libminifi/src/processors/FocusArchiveEntry.cpp ---
@@ -0,0 +1,340 @@
+/**
+ * @file FocusArchiveEntry.cpp
+ * FocusArchiveEntry class implementation
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#include "processors/FocusArchiveEntry.h"
+
+#include 
+#include 
+
+#include 
+
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+
+#include "json/json.h"
+#include "json/writer.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+core::Property FocusArchiveEntry::Path(
+"Path",
+"The path within the archive to focus (\"/\" to focus the total 
archive)",
+"");
+core::Relationship FocusArchiveEntry::Success(
+"success",
+"success operational on the flow record");
+
+bool 
FocusArchiveEntry::set_del_or_update_attr(std::shared_ptr 
flowFile, const std::string key, std::string* value) const {
+  if (value == nullptr)
+return flowFile->removeAttribute(key);
+  else if (flowFile->updateAttribute(key, *value))
+return true;
+  else
+return flowFile->addAttribute(key, *value);
+}
+
+void FocusArchiveEntry::initialize() {
+  //! Set the supported properties
+  std::set properties;
+  properties.insert(Path);
+  setSupportedProperties(properties);
+  //! Set the supported relationships
+  std::set relationships;
+  relationships.insert(Success);
+  setSupportedRelationships(relationships);
+}
+
+void FocusArchiveEntry::onTrigger(core::ProcessContext *context,
+  core::ProcessSession *session) {
+  auto flowFile = session->get();
+  std::shared_ptr flowFileRecord = 
std::static_pointer_cast(flowFile);
+
+  if (!flowFile) {
+return;
+  }
+
+  std::string targetEntry;
+  context->getProperty(Path.getName(), targetEntry);
+
+  // Extract archive contents
+  ArchiveMetadata archiveMetadata;
+  archiveMetadata.focusedEntry = targetEntry;
+  ReadCallback cb();
+  session->read(flowFile, &cb);
+
+  // For each extracted entry, import & stash to key
+  std::string targetEntryStashKey;
+
+  for (auto &entryMetadata : archiveMetadata.entryMetadata) {
+if (entryMetadata.entryType == AE_IFREG) {
+  logger_->log_info("FocusArchiveEntry importing %s from %s",
+  entryMetadata.entryName.c_str(),
+  entryMetadata.tmpFileName.c_str());
+  session->import(entryMetadata.tmpFileName, flowFile, false, 0);
+  char stashKey[37];
+  uuid_t stashKeyUuid;
--- End diff --

Fixed in d2e7e34ab8b331ac484b9b16bd51455799a1502b


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145742734
  
--- Diff: LICENSE ---
@@ -534,4 +534,68 @@ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 
OUT OF OR IN
 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
+This projects includes libarchive bundle (https://www.libarchive.org)
+which is available under a BSD License by Tim Kientzle and others
+
+Copyright (c) 2003-2009 Tim Kientzle and other authors 
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer
+   in this position and unchanged.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT,
+INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 (END LICENSE TEXT)
+
+All libarchive C sources (including .c and .h files)
+and documentation files are subject to the copyright notice reproduced
+above.
+
+This libarchive includes below files 
+libarchive/archive_entry.c
+libarchive/archive_read_support_filter_compress.c
+libarchive/archive_write_add_filter_compress.c 
+which under a 3-clause UC Regents copyright as below
+/*-
+ * Copyright (c) 1993
+ *  The Regents of the University of California.  All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 4. Neither the name of the University nor the names of its contributors
--- End diff --

Neither. It's the 3-clause UC Regents copyright, which is in the three 
aforementioned source files. The originals also skip the third list item.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145740328
  
--- Diff: CMakeLists.txt ---
@@ -101,6 +101,7 @@ set(CIVETWEB_ENABLE_SSL_DYNAMIC_LOADING OFF CACHE BOOL 
"Disable dynamic SSL libr
 set(CIVETWEB_ENABLE_CXX ON CACHE BOOL "Enable civet C++ library")
 add_subdirectory(thirdparty/yaml-cpp-yaml-cpp-0.5.3)
 add_subdirectory(thirdparty/civetweb-1.9.1 EXCLUDE_FROM_ALL)
+add_subdirectory(thirdparty/libarchive-3.3.2)
--- End diff --

@calebj I think @achristianson's point is valid that it shouldn't be done 
here. I will submit a PR against the other PR, and once that is merged (should 
be the next day) you will get that for free via the rebase. So feel free to 
ignore any comment about "extensions", since that is on me. Thanks!


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
Github user calebj commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145738276
  
--- Diff: CMakeLists.txt ---
@@ -101,6 +101,7 @@ set(CIVETWEB_ENABLE_SSL_DYNAMIC_LOADING OFF CACHE BOOL 
"Disable dynamic SSL libr
 set(CIVETWEB_ENABLE_CXX ON CACHE BOOL "Enable civet C++ library")
 add_subdirectory(thirdparty/yaml-cpp-yaml-cpp-0.5.3)
 add_subdirectory(thirdparty/civetweb-1.9.1 EXCLUDE_FROM_ALL)
+add_subdirectory(thirdparty/libarchive-3.3.2)
--- End diff --

I'm not sure how to do that. Are there any examples?


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread achristianson
Github user achristianson commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145731149
  
--- Diff: libminifi/include/core/ProcessSession.h ---
@@ -151,11 +152,47 @@ class ProcessSession {
   bool keepSource,
   uint64_t offset, char inputDelimiter);
 
+  /**
+   * Exports the data stream to a file
+   * @param string file to export stream to
+   * @param flow flow file
+   * @param bool whether or not to keep the content in the flow file
+   */
+  bool exportContent(const std::string ,
--- End diff --

Export is simply meant to be the inverse of import.

Digging into the code, export is used in this change set in 
UnfocusArchiveEntry in order to export the flow file content into a 
scratch/working location so that it can be re-assembled back into an archive.

We're ultimately calling archive_write_data (line 286 of 
UnfocusArchiveEntry.cpp), which takes data from a byte buffer. Therefore, we 
don't necessarily require a persistent filesystem for this change, as this 
scratch/working area could be in RAM or some other medium as long as there's 
enough space.

As it stands, these new archive processors do depend on persistent
filesystem storage, but this addition of exportContent does not result in
ProcessSession or any core component depending on any storage implementation
where it previously did not. We're simply adding a mirror capability to import,
which is optional to use, and where the caller is responsible for environmental
considerations/requirements such as persistent storage.

Assuming we move these new archive processors into an extension, a current 
prerequisite of using that extension will be persistent filesystem storage. 
This should be documented. This leaves open the door to future 
implementations/improvements which use RAM or some other medium to reconstitute 
the archive, but I think that level of functionality is not required 
immediately.
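
To illustrate the "export is the inverse of import" point, here is a minimal,
standalone sketch. It is not the MiNiFi API: the Content struct and the free
functions exportContent/importContent below are hypothetical stand-ins, and the
scratch area is a std::stringstream purely to show that the caller, not the
export routine, decides where the bytes land (a file stream would work the same
way).

```cpp
#include <cstdint>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for flow file content; NOT the MiNiFi FlowFile class.
struct Content {
  std::vector<std::uint8_t> bytes;
};

// Export: copy the content into whatever stream the caller supplies.
// The function has no knowledge of whether the stream is a file or RAM.
bool exportContent(const Content &content, std::ostream &destination) {
  destination.write(reinterpret_cast<const char *>(content.bytes.data()),
                    static_cast<std::streamsize>(content.bytes.size()));
  return destination.good();
}

// Import: the mirror operation, reading every byte the stream can supply.
bool importContent(std::istream &source, Content &content) {
  content.bytes.assign(std::istreambuf_iterator<char>(source),
                       std::istreambuf_iterator<char>());
  return !source.bad();
}

// Round trip through an in-memory scratch area instead of the filesystem.
bool roundTripsInMemory(const std::string &payload) {
  Content original;
  original.bytes.assign(payload.begin(), payload.end());

  std::stringstream scratch(std::ios::in | std::ios::out | std::ios::binary);
  if (!exportContent(original, scratch)) {
    return false;
  }

  Content restored;
  if (!importContent(scratch, restored)) {
    return false;
  }
  return restored.bytes == original.bytes;
}
```

Swapping the std::stringstream for a std::fstream would give the persistent
scratch location these processors currently rely on, without changing either
function.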


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145724061
  
--- Diff: CMakeLists.txt ---
@@ -101,6 +101,7 @@ set(CIVETWEB_ENABLE_SSL_DYNAMIC_LOADING OFF CACHE BOOL 
"Disable dynamic SSL libr
 set(CIVETWEB_ENABLE_CXX ON CACHE BOOL "Enable civet C++ library")
 add_subdirectory(thirdparty/yaml-cpp-yaml-cpp-0.5.3)
 add_subdirectory(thirdparty/civetweb-1.9.1 EXCLUDE_FROM_ALL)
+add_subdirectory(thirdparty/libarchive-3.3.2)
--- End diff --

Great! We'll work on getting the other PR in an extension and this can be 
rebased. 


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread achristianson
Github user achristianson commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145717915
  
--- Diff: CMakeLists.txt ---
@@ -101,6 +101,7 @@ set(CIVETWEB_ENABLE_SSL_DYNAMIC_LOADING OFF CACHE BOOL 
"Disable dynamic SSL libr
 set(CIVETWEB_ENABLE_CXX ON CACHE BOOL "Enable civet C++ library")
 add_subdirectory(thirdparty/yaml-cpp-yaml-cpp-0.5.3)
 add_subdirectory(thirdparty/civetweb-1.9.1 EXCLUDE_FROM_ALL)
+add_subdirectory(thirdparty/libarchive-3.3.2)
--- End diff --

Both this and MINIFICPP-72 use libarchive. Not opposed to this being moved 
to an extension, but it needs to be coordinated between the two.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread apiri
Github user apiri commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145714071
  
--- Diff: LICENSE ---
@@ -534,4 +534,68 @@ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 
OUT OF OR IN
 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
+This projects includes libarchive bundle (https://www.libarchive.org)
+which is available under a BSD License by Tim Kientzle and others
+
+Copyright (c) 2003-2009 Tim Kientzle and other authors 
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer
+   in this position and unchanged.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT,
+INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 (END LICENSE TEXT)
+
+All libarchive C sources (including .c and .h files)
+and documentation files are subject to the copyright notice reproduced
+above.
+
+This libarchive includes below files 
+libarchive/archive_entry.c
+libarchive/archive_read_support_filter_compress.c
+libarchive/archive_write_add_filter_compress.c 
+which under a 3-clause UC Regents copyright as below
+/*-
+ * Copyright (c) 1993
+ *  The Regents of the University of California.  All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 4. Neither the name of the University nor the names of its contributors
--- End diff --

I now see this commit was taken from the other, outstanding PR, so I will
make sure we get that taken care of there, and we can cherry-pick the work from
here onto that. I believe a number of these changes have already been covered
there.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145714089
  
--- Diff: LICENSE ---
@@ -534,4 +534,68 @@ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 
OUT OF OR IN
 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
+This projects includes libarchive bundle (https://www.libarchive.org)
+which is available under a BSD License by Tim Kientzle and others
+
+Copyright (c) 2003-2009 Tim Kientzle and other authors 
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer
+   in this position and unchanged.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT,
+INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 (END LICENSE TEXT)
+
+All libarchive C sources (including .c and .h files)
+and documentation files are subject to the copyright notice reproduced
+above.
+
+This libarchive includes below files 
+libarchive/archive_entry.c
+libarchive/archive_read_support_filter_compress.c
+libarchive/archive_write_add_filter_compress.c 
+which under a 3-clause UC Regents copyright as below
+/*-
+ * Copyright (c) 1993
+ *  The Regents of the University of California.  All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 4. Neither the name of the University nor the names of its contributors
--- End diff --

I was assuming that was the answer and the fourth clause was cherry-picked.

I assume the author then simply meant "which under a 3-clause UC
Regents [License] as below" since the copyright is a subset of the license.

Thanks.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread apiri
Github user apiri commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145713014
  
--- Diff: LICENSE ---
@@ -534,4 +534,68 @@ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 
OUT OF OR IN
 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
+This projects includes libarchive bundle (https://www.libarchive.org)
+which is available under a BSD License by Tim Kientzle and others
+
--- End diff --

archive_getdate.c, which is still bundled with this commit, is listed as
public domain, and we should have appropriate attribution in our license.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread apiri
Github user apiri commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145711436
  
--- Diff: LICENSE ---
@@ -534,4 +534,68 @@ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 
OUT OF OR IN
 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
+This projects includes libarchive bundle (https://www.libarchive.org)
+which is available under a BSD License by Tim Kientzle and others
+
+Copyright (c) 2003-2009 Tim Kientzle and other authors 
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer
+   in this position and unchanged.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT,
+INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 (END LICENSE TEXT)
+
+All libarchive C sources (including .c and .h files)
+and documentation files are subject to the copyright notice reproduced
+above.
+
+This libarchive includes below files 
+libarchive/archive_entry.c
+libarchive/archive_read_support_filter_compress.c
+libarchive/archive_write_add_filter_compress.c 
+which under a 3-clause UC Regents copyright as below
+/*-
+ * Copyright (c) 1993
+ *  The Regents of the University of California.  All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 4. Neither the name of the University nor the names of its contributors
--- End diff --

Oh, interesting. I now see there is a 3-clause license below the header that
the maintainer (I believe) of libarchive has as a 2-clause.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread apiri
Github user apiri commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145711080
  
--- Diff: LICENSE ---
@@ -534,4 +534,68 @@ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 
OUT OF OR IN
 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
+This projects includes libarchive bundle (https://www.libarchive.org)
+which is available under a BSD License by Tim Kientzle and others
+
+Copyright (c) 2003-2009 Tim Kientzle and other authors 
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer
+   in this position and unchanged.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT,
+INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 (END LICENSE TEXT)
+
+All libarchive C sources (including .c and .h files)
+and documentation files are subject to the copyright notice reproduced
+above.
+
+This libarchive includes below files 
+libarchive/archive_entry.c
+libarchive/archive_read_support_filter_compress.c
+libarchive/archive_write_add_filter_compress.c 
+which under a 3-clause UC Regents copyright as below
+/*-
+ * Copyright (c) 1993
+ *  The Regents of the University of California.  All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 4. Neither the name of the University nor the names of its contributors
--- End diff --

Certainly is listed as a 3-clause.

archive_entry.c, archive_read_support_filter_compress.c, and
archive_write_add_filter_compress.c all have 2-clause headers.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145688790
  
--- Diff: libminifi/include/core/ProcessSession.h ---
@@ -19,6 +19,7 @@
 #define __PROCESS_SESSION_H__
 
 #include 
+#include 
--- End diff --

Please don't use Boost in core components. We should find another way to do
this and extract this functionality to a Boost-supporting extension.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145690546
  
--- Diff: libminifi/include/processors/FocusArchiveEntry.h ---
@@ -0,0 +1,115 @@
+/**
+ * @file FocusArchiveEntry.h
+ * FocusArchiveEntry class declaration
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#ifndef LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+#define LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+
+#include 
+#include 
+#include 
+
+#include "FlowFileRecord.h"
+#include "core/Processor.h"
+#include "core/ProcessSession.h"
+#include "core/Core.h"
+#include "core/logging/LoggerConfiguration.h"
+#include "core/Resource.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+using logging::LoggerFactory;
--- End diff --

I'm not necessarily against a using statement for loggers, but this would
be the only one. I would like to see us continue following convention
outside of the extensions directory.


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145688494
  
--- Diff: libminifi/include/core/FlowConfiguration.h ---
@@ -35,6 +35,8 @@
 #include "processors/ExecuteProcess.h"
 #include "processors/AppendHostInfo.h"
 #include "processors/MergeContent.h"
+#include "processors/FocusArchiveEntry.h"
--- End diff --

This isn't necessary if you use an extension. 


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145691644
  
--- Diff: libminifi/src/processors/FocusArchiveEntry.cpp ---
@@ -0,0 +1,340 @@
+/**
+ * @file FocusArchiveEntry.cpp
+ * FocusArchiveEntry class implementation
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#include "processors/FocusArchiveEntry.h"
+
+#include 
+#include 
+
+#include 
+
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+
+#include "json/json.h"
+#include "json/writer.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+core::Property FocusArchiveEntry::Path(
+"Path",
+"The path within the archive to focus (\"/\" to focus the total 
archive)",
+"");
+core::Relationship FocusArchiveEntry::Success(
+"success",
+"success operational on the flow record");
+
+bool 
FocusArchiveEntry::set_del_or_update_attr(std::shared_ptr 
flowFile, const std::string key, std::string* value) const {
+  if (value == nullptr)
+return flowFile->removeAttribute(key);
+  else if (flowFile->updateAttribute(key, *value))
+return true;
+  else
+return flowFile->addAttribute(key, *value);
+}
+
+void FocusArchiveEntry::initialize() {
+  //! Set the supported properties
+  std::set properties;
+  properties.insert(Path);
+  setSupportedProperties(properties);
+  //! Set the supported relationships
+  std::set relationships;
+  relationships.insert(Success);
+  setSupportedRelationships(relationships);
+}
+
+void FocusArchiveEntry::onTrigger(core::ProcessContext *context,
+  core::ProcessSession *session) {
+  auto flowFile = session->get();
+  std::shared_ptr flowFileRecord = 
std::static_pointer_cast(flowFile);
+
+  if (!flowFile) {
+return;
+  }
+
+  std::string targetEntry;
+  context->getProperty(Path.getName(), targetEntry);
+
+  // Extract archive contents
+  ArchiveMetadata archiveMetadata;
+  archiveMetadata.focusedEntry = targetEntry;
+  ReadCallback cb();
+  session->read(flowFile, );
+
+  // For each extracted entry, import & stash to key
+  std::string targetEntryStashKey;
+
+  for (auto  : archiveMetadata.entryMetadata) {
+if (entryMetadata.entryType == AE_IFREG) {
+  logger_->log_info("FocusArchiveEntry importing %s from %s",
+  entryMetadata.entryName.c_str(),
+  entryMetadata.tmpFileName.c_str());
+  session->import(entryMetadata.tmpFileName, flowFile, false, 0);
+  char stashKey[37];
+  uuid_t stashKeyUuid;
--- End diff --

Is there any reason you were unable to use the ID generator? 


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145692040
  
--- Diff: libminifi/src/processors/FocusArchiveEntry.cpp ---
@@ -0,0 +1,340 @@
+/**
+ * @file FocusArchiveEntry.cpp
+ * FocusArchiveEntry class implementation
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#include "processors/FocusArchiveEntry.h"
+
+#include 
+#include 
+
+#include 
+
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+
+#include "json/json.h"
+#include "json/writer.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+core::Property FocusArchiveEntry::Path(
+"Path",
+"The path within the archive to focus (\"/\" to focus the total 
archive)",
+"");
+core::Relationship FocusArchiveEntry::Success(
+"success",
+"success operational on the flow record");
+
+bool 
FocusArchiveEntry::set_del_or_update_attr(std::shared_ptr 
flowFile, const std::string key, std::string* value) const {
+  if (value == nullptr)
+return flowFile->removeAttribute(key);
+  else if (flowFile->updateAttribute(key, *value))
+return true;
+  else
+return flowFile->addAttribute(key, *value);
+}
+
+void FocusArchiveEntry::initialize() {
+  //! Set the supported properties
+  std::set properties;
+  properties.insert(Path);
+  setSupportedProperties(properties);
+  //! Set the supported relationships
+  std::set relationships;
+  relationships.insert(Success);
+  setSupportedRelationships(relationships);
+}
+
+void FocusArchiveEntry::onTrigger(core::ProcessContext *context,
+  core::ProcessSession *session) {
+  auto flowFile = session->get();
+  std::shared_ptr flowFileRecord = 
std::static_pointer_cast(flowFile);
+
+  if (!flowFile) {
+return;
+  }
+
+  std::string targetEntry;
+  context->getProperty(Path.getName(), targetEntry);
+
+  // Extract archive contents
+  ArchiveMetadata archiveMetadata;
+  archiveMetadata.focusedEntry = targetEntry;
+  ReadCallback cb();
+  session->read(flowFile, );
+
+  // For each extracted entry, import & stash to key
+  std::string targetEntryStashKey;
+
+  for (auto  : archiveMetadata.entryMetadata) {
+if (entryMetadata.entryType == AE_IFREG) {
+  logger_->log_info("FocusArchiveEntry importing %s from %s",
+  entryMetadata.entryName.c_str(),
+  entryMetadata.tmpFileName.c_str());
+  session->import(entryMetadata.tmpFileName, flowFile, false, 0);
+  char stashKey[37];
+  uuid_t stashKeyUuid;
+  uuid_generate(stashKeyUuid);
+  uuid_unparse_lower(stashKeyUuid, stashKey);
+  logger_->log_debug(
+  "FocusArchiveEntry generated stash key %s for entry %s",
+  stashKey,
+  entryMetadata.entryName.c_str());
+  entryMetadata.stashKey.assign(stashKey);
+
+  if (entryMetadata.entryName == targetEntry) {
+targetEntryStashKey = entryMetadata.stashKey;
+  }
+
+  // Stash the content
+  session->stash(entryMetadata.stashKey, flowFile);
+}
+  }
+
+  // Restore target archive entry
+  if (targetEntryStashKey != "") {
+session->restore(targetEntryStashKey, flowFile);
+  } else {
+logger_->log_warn(
+  "FocusArchiveEntry failed to locate target entry: %s",
+  targetEntry.c_str());
+  }
+
+  // Set new/updated lens stack to attribute
+  {
+Json::Value lensStack;
+Json::Reader reader;
+
+std::string existingLensStack;
+
+if (flowFile->getAttribute("lens.archive.stack", existingLensStack)) {
+  

[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145688465
  
--- Diff: libminifi/include/FlowFileRecord.h ---
@@ -164,6 +164,11 @@ class FlowFileRecord : public core::FlowFile, public 
io::Serializable {
 return content_full_fath_;
   }
 
+  /**
+   * Cleanly relinquish a resource claim
+   */
--- End diff --

What does relinquish a claim mean in this context? From what/where?


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145688909
  
--- Diff: libminifi/include/core/ProcessSession.h ---
@@ -151,11 +152,47 @@ class ProcessSession {
   bool keepSource,
   uint64_t offset, char inputDelimiter);
 
+  /**
+   * Exports the data stream to a file
+   * @param string file to export stream to
+   * @param flow flow file
+   * @param bool whether or not to keep the content in the flow file
+   */
+  bool exportContent(const std::string ,
+ std::shared_ptr ,
+ bool keepContent);
+
+  bool exportContent(const std::string ,
+ const std::string ,
+ std::shared_ptr ,
+ bool keepContent);
+
+  // Stash the content to a key
+  void stash(const std::string , std::shared_ptr flow);
+   // Restore content previously stashed to a key
+  void restore(const std::string , std::shared_ptr 
flow);
+
 // Prevent default copy constructor and assignment operation
 // Only support pass by reference or pointer
   ProcessSession(const ProcessSession ) = delete;
   ProcessSession =(const ProcessSession ) = delete;
 
+  class ReadCallback : public InputStreamCallback {
--- End diff --

Can we move this elsewhere? 


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145692128
  
--- Diff: libminifi/src/processors/FocusArchiveEntry.cpp ---
@@ -0,0 +1,340 @@
+/**
+ * @file FocusArchiveEntry.cpp
+ * FocusArchiveEntry class implementation
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#include "processors/FocusArchiveEntry.h"
+
+#include 
+#include 
+
+#include 
+
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+
+#include "json/json.h"
+#include "json/writer.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+core::Property FocusArchiveEntry::Path(
+"Path",
+"The path within the archive to focus (\"/\" to focus the total 
archive)",
+"");
+core::Relationship FocusArchiveEntry::Success(
+"success",
+"success operational on the flow record");
+
+bool 
FocusArchiveEntry::set_del_or_update_attr(std::shared_ptr 
flowFile, const std::string key, std::string* value) const {
+  if (value == nullptr)
+return flowFile->removeAttribute(key);
+  else if (flowFile->updateAttribute(key, *value))
+return true;
+  else
+return flowFile->addAttribute(key, *value);
+}
+
+void FocusArchiveEntry::initialize() {
+  //! Set the supported properties
+  std::set properties;
+  properties.insert(Path);
+  setSupportedProperties(properties);
+  //! Set the supported relationships
+  std::set relationships;
+  relationships.insert(Success);
+  setSupportedRelationships(relationships);
+}
+
+void FocusArchiveEntry::onTrigger(core::ProcessContext *context,
+  core::ProcessSession *session) {
+  auto flowFile = session->get();
+  std::shared_ptr flowFileRecord = 
std::static_pointer_cast(flowFile);
+
+  if (!flowFile) {
+return;
+  }
+
+  std::string targetEntry;
+  context->getProperty(Path.getName(), targetEntry);
+
+  // Extract archive contents
+  ArchiveMetadata archiveMetadata;
+  archiveMetadata.focusedEntry = targetEntry;
+  ReadCallback cb();
+  session->read(flowFile, &cb);
+
+  // For each extracted entry, import & stash to key
+  std::string targetEntryStashKey;
+
+  for (auto &entryMetadata : archiveMetadata.entryMetadata) {
+if (entryMetadata.entryType == AE_IFREG) {
+  logger_->log_info("FocusArchiveEntry importing %s from %s",
+  entryMetadata.entryName.c_str(),
+  entryMetadata.tmpFileName.c_str());
+  session->import(entryMetadata.tmpFileName, flowFile, false, 0);
+  char stashKey[37];
+  uuid_t stashKeyUuid;
+  uuid_generate(stashKeyUuid);
+  uuid_unparse_lower(stashKeyUuid, stashKey);
+  logger_->log_debug(
+  "FocusArchiveEntry generated stash key %s for entry %s",
+  stashKey,
+  entryMetadata.entryName.c_str());
+  entryMetadata.stashKey.assign(stashKey);
+
+  if (entryMetadata.entryName == targetEntry) {
+targetEntryStashKey = entryMetadata.stashKey;
+  }
+
+  // Stash the content
+  session->stash(entryMetadata.stashKey, flowFile);
+}
+  }
+
+  // Restore target archive entry
+  if (targetEntryStashKey != "") {
+session->restore(targetEntryStashKey, flowFile);
+  } else {
+logger_->log_warn(
+  "FocusArchiveEntry failed to locate target entry: %s",
+  targetEntry.c_str());
+  }
+
+  // Set new/updated lens stack to attribute
+  {
+Json::Value lensStack;
+Json::Reader reader;
+
+std::string existingLensStack;
+
+if (flowFile->getAttribute("lens.archive.stack", existingLensStack)) {
+  

[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145691328
  
--- Diff: libminifi/src/core/ProcessSession.cpp ---
@@ -799,6 +799,152 @@ void ProcessSession::import(std::string source, 
std::shared_ptr
   }
 }
 
+bool ProcessSession::exportContent(
+const std::string &destination,
+const std::string &tmpFile,
+std::shared_ptr &flow,
+bool keepContent) {
+  logger_->log_info(
+  "Exporting content of %s to %s",
+  flow->getUUIDStr().c_str(),
+  destination.c_str());
+
+  ReadCallback cb(tmpFile, destination, logger_);
+  read(flow, &cb);
+
+  logger_->log_info("Committing %s", destination.c_str());
+  bool commit_ok = cb.commit();
+
+  if (commit_ok) {
+logger_->log_info("Commit OK.");
+  } else {
+logger_->log_error(
+  "Commit of %s to %s failed!",
+  flow->getUUIDStr().c_str(),
+  destination.c_str());
+  }
+  return commit_ok;
+}
+
+bool ProcessSession::exportContent(
+const std::string &destination,
+std::shared_ptr &flow,
+bool keepContent) {
+  std::string tmpFileName = boost::filesystem::unique_path().native();
+  return exportContent(destination, tmpFileName, flow, keepContent);
+}
+
+ProcessSession::ReadCallback::ReadCallback(const std::string &tmpFile,
+   const std::string &destFile,
+   std::shared_ptr logger)
+: _tmpFile(tmpFile),
+  _tmpFileOs(tmpFile, std::ios::binary),
+  _destFile(destFile),
+  logger_(logger) {
+}
+
+// Copy the entire file contents to the temporary file
+int64_t 
ProcessSession::ReadCallback::process(std::shared_ptr stream) {
+  // Copy file contents into tmp file
+  _writeSucceeded = false;
+  size_t size = 0;
+  uint8_t buffer[8192];
+  do {
+int read = stream->read(buffer, 8192);
+if (read < 0) {
+  return -1;
+}
+if (read == 0) {
+  break;
+}
+_tmpFileOs.write(reinterpret_cast(buffer), read);
+size += read;
+  } while (size < stream->getSize());
+  _writeSucceeded = true;
+  return size;
+}
+
+// Renames tmp file to final destination
+// Returns true if commit succeeded
+bool ProcessSession::ReadCallback::commit() {
+  bool success = false;
+
+  logger_->log_info("committing export operation to %s", _destFile.c_str());
+
+  if (_writeSucceeded) {
+_tmpFileOs.close();
+
+if (rename(_tmpFile.c_str(), _destFile.c_str())) {
+  logger_->log_info("commit export operation to %s failed because rename() call failed", _destFile.c_str());
+} else {
+  success = true;
+  logger_->log_info("commit export operation to %s succeeded", _destFile.c_str());
+}
+  } else {
+logger_->log_error("commit export operation to %s failed because write failed", _destFile.c_str());
+  }
+  return success;
+}
+
+// Clean up resources
+ProcessSession::ReadCallback::~ReadCallback() {
+  // Close tmp file
+  _tmpFileOs.close();
+
+  // Clean up tmp file, if necessary
+  unlink(_tmpFile.c_str());
+}
+
+
+void ProcessSession::stash(const std::string , 
std::shared_ptr flow) {
--- End diff --

What are the ramifications if power is lost and the RocksDB repo's WAL 
causes us to repeat this flow? Would the previous tmp file be left around? 
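
To make the concern concrete: the export writes to a temporary file and then renames it, and the destructor unlinks the temporary file. If the process dies between those steps, the destructor never runs and the temporary file stays on disk. The following is only a sketch, assuming C++17 std::filesystem and a hypothetical ".tmp-export" naming convention so that leftovers can be recognized (the PR itself uses a random unique path, and the names exportAtomically and sweepStaleTempFiles are invented for illustration). It shows a write-then-rename export plus a startup sweep that removes orphaned temporary files:

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical suffix marking in-progress exports; not part of the PR.
const std::string kTmpSuffix = ".tmp-export";

// Write data to "<dest>.tmp-export", then rename it over dest.
// Returns false (and removes the temporary file) if either step fails.
bool exportAtomically(const fs::path &dest, const std::string &data) {
  assert(!dest.empty());
  fs::path tmp = dest;
  tmp += kTmpSuffix;

  bool written = false;
  {
    std::ofstream os(tmp, std::ios::binary | std::ios::trunc);
    os.write(data.data(), static_cast<std::streamsize>(data.size()));
    os.close();  // close before renaming
    written = !os.fail();
  }

  std::error_code ec;
  if (written) {
    fs::rename(tmp, dest, ec);
    if (!ec) {
      return true;
    }
  }
  fs::remove(tmp, ec);  // best-effort cleanup on failure
  return false;
}

// Remove temporary files left behind by an interrupted export.
// Returns the number of files removed.
std::size_t sweepStaleTempFiles(const fs::path &dir) {
  // Collect first, then delete, so the directory is not modified mid-iteration.
  std::vector<fs::path> stale;
  std::error_code ec;
  for (fs::directory_iterator it(dir, ec), end; !ec && it != end; it.increment(ec)) {
    const std::string name = it->path().filename().string();
    const bool hasSuffix =
        name.size() > kTmpSuffix.size() &&
        name.compare(name.size() - kTmpSuffix.size(), kTmpSuffix.size(), kTmpSuffix) == 0;
    if (hasSuffix) {
      stale.push_back(it->path());
    }
  }

  std::size_t removed = 0;
  for (const auto &p : stale) {
    std::error_code rmEc;
    if (fs::remove(p, rmEc)) {
      ++removed;
    }
  }
  return removed;
}
```

This sketch does not flush data to stable storage before the rename, so it only addresses leftover files, not durability of the exported content across a power loss.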


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145687725
  
--- Diff: CMakeLists.txt ---
@@ -101,6 +101,7 @@ set(CIVETWEB_ENABLE_SSL_DYNAMIC_LOADING OFF CACHE BOOL 
"Disable dynamic SSL libr
 set(CIVETWEB_ENABLE_CXX ON CACHE BOOL "Enable civet C++ library")
 add_subdirectory(thirdparty/yaml-cpp-yaml-cpp-0.5.3)
 add_subdirectory(thirdparty/civetweb-1.9.1 EXCLUDE_FROM_ALL)
+add_subdirectory(thirdparty/libarchive-3.3.2)
--- End diff --

Can we move this into an extension, so that it can be included or 
excluded at build time?


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145690194
  
--- Diff: libminifi/include/processors/FocusArchiveEntry.h ---
@@ -0,0 +1,115 @@
+/**
+ * @file FocusArchiveEntry.h
+ * FocusArchiveEntry class declaration
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#ifndef LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+#define LIBMINIFI_INCLUDE_PROCESSORS_FOCUSARCHIVEENTRY_H_
+
+#include 
+#include 
+#include 
+
+#include "FlowFileRecord.h"
+#include "core/Processor.h"
+#include "core/ProcessSession.h"
+#include "core/Core.h"
+#include "core/logging/LoggerConfiguration.h"
+#include "core/Resource.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+using logging::LoggerFactory;
+
+//! FocusArchiveEntry Class
+class FocusArchiveEntry : public core::Processor {
+ public:
+  //! Constructor
+  /*!
+   * Create a new processor
+   */
+  explicit FocusArchiveEntry(std::string name, uuid_t uuid = NULL)
+  : core::Processor(name, uuid),
+logger_(logging::LoggerFactory::getLogger()) {
+  }
+  //! Destructor
+  virtual ~FocusArchiveEntry()   {
+  }
+  //! Processor Name
+  static constexpr char const* ProcessorName = "FocusArchiveEntry";
+  //! Supported Properties
+  static core::Property Path;
+  //! Supported Relationships
+  static core::Relationship Success;
+
+  bool set_del_or_update_attr(std::shared_ptr, const 
std::string, std::string*) const;
--- End diff --

Why std::string * over const std::string &?  The pointer semantics imply 
you will be changing the third argument. 
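
As an illustration of the point: a non-const pointer parameter reads like an out-parameter, whereas in the diff the pointer only encodes "value or no value". The sketch below assumes a plain std::map as a stand-in for the flow file's attributes. The AttributeMap alias and both function names are hypothetical, not the PR's API. It splits the two cases so the value can be taken as const std::string &:

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for a flow file's attribute store; hypothetical, for illustration only.
using AttributeMap = std::map<std::string, std::string>;

// Insert the attribute or overwrite an existing one.
// The const reference makes it clear that `value` is read-only input.
// Returns true if the key already existed (an update), false if it was added.
bool setOrUpdateAttr(AttributeMap &attrs, const std::string &key, const std::string &value) {
  assert(!key.empty());
  auto it = attrs.find(key);
  if (it != attrs.end()) {
    it->second = value;
    return true;
  }
  attrs.emplace(key, value);
  return false;
}

// Remove the attribute; returns true if something was removed.
bool removeAttr(AttributeMap &attrs, const std::string &key) {
  return attrs.erase(key) > 0;
}
```

If a single entry point is preferred, a const std::string * parameter would keep the nullable behavior while still signalling that the argument is not modified.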





---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145689143
  
--- Diff: libminifi/include/core/ProcessSession.h ---
@@ -151,11 +152,47 @@ class ProcessSession {
   bool keepSource,
   uint64_t offset, char inputDelimiter);
 
+  /**
+   * Exports the data stream to a file
+   * @param string file to export stream to
+   * @param flow flow file
+   * @param bool whether or not to keep the content in the flow file
+   */
+  bool exportContent(const std::string ,
--- End diff --

Does export content imply we will always have persistent storage? If not, 
we could use std::shared_ptr stream for input and output and apply 
a file stream to them? 
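
A rough sketch of the stream-based alternative raised here, using the standard std::istream and std::ostream as stand-ins for MiNiFi's own stream types (the names copyStream, exportToString and exportToFile are hypothetical). The same copy loop can target a file or an in-memory buffer, so persistent storage is not assumed:

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copy everything from `in` to `out` in fixed-size chunks.
// Returns the number of bytes copied, or -1 if writing fails.
int64_t copyStream(std::istream &in, std::ostream &out) {
  char buffer[8192];
  int64_t total = 0;
  while (in) {
    in.read(buffer, sizeof(buffer));
    const std::streamsize got = in.gcount();
    if (got <= 0) {
      break;
    }
    out.write(buffer, got);
    if (!out) {
      return -1;
    }
    total += got;
  }
  return total;
}

// Export into memory: no file system is touched.
std::string exportToString(std::istream &in) {
  std::ostringstream sink;
  const int64_t copied = copyStream(in, sink);
  assert(copied >= 0);  // an in-memory sink is not expected to fail here
  return sink.str();
}

// Export to a file by applying a file stream to the same copy routine.
bool exportToFile(std::istream &in, const std::string &path) {
  std::ofstream sink(path, std::ios::binary | std::ios::trunc);
  if (!sink) {
    return false;
  }
  return copyStream(in, sink) >= 0;
}
```

With this shape the caller decides where the bytes go, and the file-based export becomes one thin wrapper among several possible sinks.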


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145688553
  
--- Diff: libminifi/include/core/FlowFile.h ---
@@ -50,6 +50,32 @@ class FlowFile : public core::Connectable {
   void clearResourceClaim();
 
   /**
+   * Returns a pointer to this flow file record's
+   * claim at the given stash key
+   */
+  std::shared_ptr getStashClaim(const std::string );
+
+  /**
+   * Sets the given stash key to the inbound claim argument
+   */
+  void setStashClaim(const std::string , 
std::shared_ptr );
--- End diff --

Please use const on all shared_ptr parameters. 
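
A sketch of what this request amounts to, with hypothetical stand-in types (Claim and StashHolder are not the PR's classes): taking the pointer as const std::shared_ptr<T> & documents that the callee will not reseat the caller's pointer, and the reference count only changes when the callee actually stores a copy.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a resource claim.
struct Claim {
  std::string path;
};

// Hypothetical holder mirroring the getStashClaim/setStashClaim pair in the diff.
class StashHolder {
 public:
  // const reference: the caller's pointer cannot be reseated here, and the
  // only reference-count increment is the copy stored in the map.
  void setStashClaim(const std::string &key, const std::shared_ptr<Claim> &claim) {
    assert(claim != nullptr);
    stash_[key] = claim;
  }

  // Returns the stashed claim, or nullptr if the key is unknown.
  std::shared_ptr<Claim> getStashClaim(const std::string &key) const {
    auto it = stash_.find(key);
    if (it == stash_.end()) {
      return nullptr;
    }
    return it->second;
  }

 private:
  std::map<std::string, std::shared_ptr<Claim>> stash_;
};
```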


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148#discussion_r145688253
  
--- Diff: LICENSE ---
@@ -534,4 +534,68 @@ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 
OUT OF OR IN
 CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 
+This projects includes libarchive bundle (https://www.libarchive.org)
+which is available under a BSD License by Tim Kientzle and others
+
+Copyright (c) 2003-2009 Tim Kientzle and other authors 
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer
+   in this position and unchanged.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT,
+INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 (END LICENSE TEXT)
+
+All libarchive C sources (including .c and .h files)
+and documentation files are subject to the copyright notice reproduced
+above.
+
+This libarchive includes below files 
+libarchive/archive_entry.c
+libarchive/archive_read_support_filter_compress.c
+libarchive/archive_write_add_filter_compress.c 
+which under a 3-clause UC Regents copyright as below
+/*-
+ * Copyright (c) 1993
+ *  The Regents of the University of California.  All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 4. Neither the name of the University nor the names of its contributors
--- End diff --

Is this a BSD 3-Clause or a BSD 4-Clause license? 

As I understand it, BSD 4-Clause code may not be included. 


---


[GitHub] nifi-minifi-cpp pull request #148: MINIFI-244 Un/FocusArchive processors

2017-10-19 Thread calebj
GitHub user calebj opened a pull request:

https://github.com/apache/nifi-minifi-cpp/pull/148

 MINIFI-244 Un/FocusArchive processors

Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced
 in the commit message?

- [x] Does your PR title start with MINIFI-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] If applicable, have you updated the LICENSE file?
- [x] If applicable, have you updated the NOTICE file?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.

Split into two commits to share a common base with #146.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NiFiLocal/nifi-minifi-cpp MINIFI-244-rc2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi-minifi-cpp/pull/148.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #148


commit 5cf2ae943e9eb2ab8c289fc25da451b96da38a04
Author: Caleb Johnson 
Date:   2017-10-18T18:21:29Z

Pull in MINIFICPP-72's libarchive

commit 3a32355c8f81e7216fca5ba8a83e9781aaeedb04
Author: Caleb Johnson 
Date:   2017-10-17T15:45:53Z

MINIFI-244 Un/FocusArchive processors




---