[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-quickstep/pull/291


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199350
  
--- Diff: relational_operators/TextScanOperator.cpp ---
@@ -82,14 +81,19 @@ static bool ValidateTextScanTextSegmentSize(const char 
*flagname,
   return true;
 }
 
-static const volatile bool text_scan_text_segment_size_dummy = 
gflags::RegisterFlagValidator(
-_textscan_text_segment_size, );
+static const volatile bool text_scan_text_segment_size_dummy =
+gflags::RegisterFlagValidator(
+_textscan_text_segment_size, 
);
 
 namespace {
 
-size_t getFileSize(const string _name) {
+static std::size_t GetFileSize(const std::string _name) {
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198329
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -143,6 +145,45 @@ namespace E = ::quickstep::optimizer::expressions;
 namespace L = ::quickstep::optimizer::logical;
 namespace S = ::quickstep::serialization;
 
+namespace {
+
+static attribute_id GetAttributeIdFromName(
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136197963
  
--- Diff: query_optimizer/logical/LogicalType.hpp ---
@@ -34,6 +34,7 @@ namespace logical {
 enum class LogicalType {
   kAggregate,
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136197848
  
--- Diff: query_optimizer/logical/CopyTo.hpp ---
@@ -0,0 +1,141 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_QUERY_OPTIMIZER_LOGICAL_COPY_TO_HPP_
+#define QUICKSTEP_QUERY_OPTIMIZER_LOGICAL_COPY_TO_HPP_
+
+#include 
+#include 
+#include 
+
+#include "query_optimizer/OptimizerTree.hpp"
+#include "query_optimizer/expressions/AttributeReference.hpp"
+#include "query_optimizer/logical/Logical.hpp"
+#include "query_optimizer/logical/LogicalType.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+namespace optimizer {
+namespace logical {
+
+/** \addtogroup OptimizerLogical
+ *  @{
+ */
+
+class CopyTo;
+typedef std::shared_ptr CopyToPtr;
+
+/**
+ * @brief Represents an operation that copies data from a relation to a 
text file.
+ */
+class CopyTo : public Logical {
+ public:
+  LogicalType getLogicalType() const override {
+return LogicalType::kCopyTo;
+  }
+
+  std::string getName() const override {
+return "CopyTo";
+  }
+
+  /**
+   * @return The input relation whose data is to be exported.
+   */
+  const LogicalPtr& input() const {
+return input_;
+  }
+
+  /**
+   * @return The name of the file to write the data to.
+   */
+  const std::string& file_name() const {
+return file_name_;
+  }
+
+  /**
+   * @return The options for this COPY TO statement.
+   */
+  BulkIOConfigurationPtr options() const {
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136180815
  
--- Diff: query_optimizer/logical/CopyFrom.hpp ---
@@ -66,20 +67,14 @@ class CopyFrom : public Logical {
   const std::string& file_name() const { return file_name_; }
 
   /**
-   * @return The delimiter used in the text file to separate columns.
+   * @return The options for this COPY FROM statement.
*/
-  const char column_delimiter() const { return column_delimiter_; }
-
-  /**
-   * @return Whether to decode escape sequences in the text file.
-   */
-  bool escape_strings() const { return escape_strings_; }
+  BulkIOConfigurationPtr options() const { return options_; }
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136178849
  
--- Diff: parser/ParseStatement.hpp ---
@@ -771,122 +776,135 @@ class ParseStatementInsertSelection : public 
ParseStatementInsert {
   DISALLOW_COPY_AND_ASSIGN(ParseStatementInsertSelection);
 };
 
+
 /**
- * @brief Optional parameters for a COPY FROM statement.
+ * @brief The parsed representation of a COPY FROM/COPY TO statement.
  **/
-struct ParseCopyFromParams : public ParseTreeNode {
-  /**
-   * @brief Constructor, sets default values.
-   **/
-  ParseCopyFromParams(const int line_number, const int column_number)
-  : ParseTreeNode(line_number, column_number),
-escape_strings(true) {
-  }
-
-  std::string getName() const override { return "CopyFromParams"; }
-
+class ParseStatementCopy : public ParseStatement {
+ public:
   /**
-   * @brief Sets the column delimiter.
-   *
-   * @param delimiter_in The column delimiter string.
+   * @brief Copy direction (FROM text file/TO text file).
*/
-  void set_delimiter(ParseString* delimiter_in) {
-delimiter.reset(delimiter_in);
-  }
-
-  /**
-   * @brief The string which terminates individual attribute values in the
-   *input file. Can be NULL.
-   **/
-  std::unique_ptr delimiter;
+  enum CopyDirection {
+kFrom,
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199193
  
--- Diff: relational_operators/TableExportOperator.hpp ---
@@ -0,0 +1,268 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+#define QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "catalog/CatalogRelation.hpp"
+#include "catalog/CatalogTypedefs.hpp"
+#include "query_execution/QueryContext.hpp"
+#include "relational_operators/RelationalOperator.hpp"
+#include "relational_operators/WorkOrder.hpp"
+#include "storage/StorageBlockInfo.hpp"
+#include "threading/SpinMutex.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+#include "tmb/id_typedefs.h"
+
+namespace tmb { class MessageBus; }
+
+namespace quickstep {
+
+class CatalogRelationSchema;
+class StorageManager;
+class ValueAccessor;
+class WorkOrderProtosContainer;
+class WorkOrdersContainer;
+
+namespace serialization { class WorkOrder; }
+
+/** \addtogroup RelationalOperators
+ *  @{
+ */
+
+class TableExportOperator : public RelationalOperator {
+ public:
+  /**
+   * @brief Feedback message to Foreman when a 
TableExportToStringWorkOrder has
+   *completed writing a block to the string buffer.
+   */
+  enum FeedbackMessageType : WorkOrder::FeedbackMessageType {
+  kBlockOutputMessage,
+  };
+
+  /**
+   * @brief Constructor.
+   *
+   * @param query_id The ID of the query to which this operator belongs.
+   * @param input_relation The relation to export.
+   * @param input_relation_is_stored If input_relation is a stored 
relation and
+   *is fully available to the operator before it can start 
generating
+   *workorders.
+   * @param file_name The name of the file to export the relation to.
+   * @param options The options that specify the detailed format of the 
output
+   *file.
+   */
+  TableExportOperator(const std::size_t query_id,
+  const CatalogRelation _relation,
+  const bool input_relation_is_stored,
+  const std::string _name,
+  const BulkIOConfigurationPtr )
+  : RelationalOperator(query_id),
+input_relation_(input_relation),
+input_relation_is_stored_(input_relation_is_stored),
+file_name_(file_name),
+options_(options),
+input_relation_block_ids_(input_relation_is_stored
+  ? input_relation.getBlocksSnapshot()
+  : std::vector()),
+num_workorders_generated_(0),
+started_(false),
+num_blocks_written_(0),
+file_(nullptr) {}
+
+  ~TableExportOperator() override {}
+
+  OperatorType getOperatorType() const override {
+return kTableExport;
+  }
+
+  std::string getName() const override {
+return "TableExportOperator";
+  }
+
+  /**
+   * @return The relation to export.
+   */
+  const CatalogRelation& input_relation() const {
+return input_relation_;
+  }
+
+  bool getAllWorkOrders(WorkOrdersContainer *container,
+QueryContext *query_context,
+StorageManager *storage_manager,
+const tmb::client_id scheduler_client_id,
+tmb::MessageBus *bus) override;
+
+  bool getAllWorkOrderProtos(WorkOrderProtosContainer *container) override;
+
+  void feedInputBlock(const block_id input_block_id,
+  const relation_id input_relation_id,

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198182
  
--- Diff: query_optimizer/physical/CopyFrom.hpp ---
@@ -68,22 +69,14 @@ class CopyFrom : public Physical {
   const std::string& file_name() const { return file_name_; }
 
   /**
-   * @return The delimiter used in the text file to separate columns.
+   * @return The options for this COPY FROM statement.
*/
-  const char column_delimiter() const { return column_delimiter_; }
-
-  /**
-   * @return Whether to decode escape sequences in the text file.
-   */
-  bool escape_strings() const { return escape_strings_; }
+  BulkIOConfigurationPtr options() const { return options_; }
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198572
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
+  THROW_SQL_ERROR_AT(parse_delimiter)
+  << "DELIMITER is not a single character";
+}
+options->setDelimiter(delimiter.front());
+  } else if (key == "escape_strings") {
+options->setEscapeStrings(GetKeyValueBool(param));
+  } else if (key != "format") {
+THROW_SQL_ERROR_AT() << "Unsupported copy option: " << key;
   }
 }
-escape_strings_ = params->escape_strings;
   }
 
   return 
L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
- 
copy_from_statement.source_filename()->value(),
- column_delimiter_[0],
- escape_strings_);
+ copy_from_statement.file_name()->value(),
+ BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+const ParseStatementCopy _to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+if (format == "csv") {
+  file_format = BulkIOFormat::kCSV;
+} else if (format == "text") {
+  file_format = BulkIOFormat::kText;
+} else {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+format_specified = true;
+break;
+  }
+}
+  }
+
+  const std::string _name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198255
  
--- Diff: query_optimizer/physical/CopyTo.hpp ---
@@ -0,0 +1,147 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_QUERY_OPTIMIZER_PHYSICAL_COPY_TO_HPP_
+#define QUICKSTEP_QUERY_OPTIMIZER_PHYSICAL_COPY_TO_HPP_
+
+#include 
+#include 
+#include 
+
+#include "query_optimizer/OptimizerTree.hpp"
+#include "query_optimizer/expressions/AttributeReference.hpp"
+#include "query_optimizer/physical/Physical.hpp"
+#include "query_optimizer/physical/PhysicalType.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+namespace optimizer {
+namespace physical {
+
+/** \addtogroup OptimizerPhysical
+ *  @{
+ */
+
+class CopyTo;
+typedef std::shared_ptr CopyToPtr;
+
+/**
+ * @brief Represents an operation that copies data from a relation to a 
text file.
+ */
+class CopyTo : public Physical {
+ public:
+  PhysicalType getPhysicalType() const override {
+return PhysicalType::kCopyTo;
+  }
+
+  std::string getName() const override {
+return "CopyTo";
+  }
+
+  /**
+   * @return The input relation whose data is to be exported.
+   */
+  const PhysicalPtr& input() const {
+return input_;
+  }
+
+  /**
+   * @return The name of the file to write the data to.
+   */
+  const std::string& file_name() const {
+return file_name_;
+  }
+
+  /**
+   * @return The options for this COPY TO statement.
+   */
+  BulkIOConfigurationPtr options() const {
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198306
  
--- Diff: query_optimizer/physical/PhysicalType.hpp ---
@@ -34,6 +34,7 @@ namespace physical {
 enum class PhysicalType {
   kAggregate,
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198462
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198547
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199047
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -1595,6 +1742,19 @@ void 
Resolver::appendProjectIfNeedPrecomputationAfterAggregation(
   }
 }
 
+void Resolver::reportIfWithClauseUnused(
+const PtrVector _list) const {
+  if (!with_queries_info_.unreferenced_query_indexes.empty()) {
+int unreferenced_with_query_index = 
*with_queries_info_.unreferenced_query_indexes.begin();
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199940
  
--- Diff: utility/ExecutionDAGVisualizer.cpp ---
@@ -55,12 +55,15 @@ using std::to_string;
 namespace quickstep {
 
 DEFINE_bool(visualize_execution_dag_partition_info, false,
-"If true, display the operator partition info in the 
visualized execution plan DAG."
-"Valid iif 'visualize_execution_dag' turns on.");
+"If true, display the operator partition info in the 
visualized "
+"execution plan DAG. Valid if 'visualize_execution_dag' turns 
on.");
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199148
  
--- Diff: relational_operators/TableExportOperator.hpp ---
@@ -0,0 +1,268 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+#define QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "catalog/CatalogRelation.hpp"
+#include "catalog/CatalogTypedefs.hpp"
+#include "query_execution/QueryContext.hpp"
+#include "relational_operators/RelationalOperator.hpp"
+#include "relational_operators/WorkOrder.hpp"
+#include "storage/StorageBlockInfo.hpp"
+#include "threading/SpinMutex.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+#include "tmb/id_typedefs.h"
+
+namespace tmb { class MessageBus; }
+
+namespace quickstep {
+
+class CatalogRelationSchema;
+class StorageManager;
+class ValueAccessor;
+class WorkOrderProtosContainer;
+class WorkOrdersContainer;
+
+namespace serialization { class WorkOrder; }
+
+/** \addtogroup RelationalOperators
+ *  @{
+ */
+
+class TableExportOperator : public RelationalOperator {
+ public:
+  /**
+   * @brief Feedback message to Foreman when a 
TableExportToStringWorkOrder has
+   *completed writing a block to the string buffer.
+   */
+  enum FeedbackMessageType : WorkOrder::FeedbackMessageType {
+  kBlockOutputMessage,
+  };
+
+  /**
+   * @brief Constructor.
+   *
+   * @param query_id The ID of the query to which this operator belongs.
+   * @param input_relation The relation to export.
+   * @param input_relation_is_stored If input_relation is a stored 
relation and
+   *is fully available to the operator before it can start 
generating
+   *workorders.
+   * @param file_name The name of the file to export the relation to.
+   * @param options The options that specify the detailed format of the 
output
+   *file.
+   */
+  TableExportOperator(const std::size_t query_id,
+  const CatalogRelation _relation,
+  const bool input_relation_is_stored,
+  const std::string _name,
+  const BulkIOConfigurationPtr )
+  : RelationalOperator(query_id),
+input_relation_(input_relation),
+input_relation_is_stored_(input_relation_is_stored),
+file_name_(file_name),
+options_(options),
+input_relation_block_ids_(input_relation_is_stored
+  ? input_relation.getBlocksSnapshot()
+  : std::vector()),
+num_workorders_generated_(0),
+started_(false),
+num_blocks_written_(0),
+file_(nullptr) {}
+
+  ~TableExportOperator() override {}
+
+  OperatorType getOperatorType() const override {
+return kTableExport;
+  }
+
+  std::string getName() const override {
+return "TableExportOperator";
+  }
+
+  /**
+   * @return The relation to export.
+   */
+  const CatalogRelation& input_relation() const {
+return input_relation_;
+  }
+
+  bool getAllWorkOrders(WorkOrdersContainer *container,
+QueryContext *query_context,
+StorageManager *storage_manager,
+const tmb::client_id scheduler_client_id,
+tmb::MessageBus *bus) override;
+
+  bool getAllWorkOrderProtos(WorkOrderProtosContainer *container) override;
+
+  void feedInputBlock(const block_id input_block_id,
+  const relation_id input_relation_id,

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136198676
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
+  THROW_SQL_ERROR_AT(parse_delimiter)
+  << "DELIMITER is not a single character";
+}
+options->setDelimiter(delimiter.front());
+  } else if (key == "escape_strings") {
+options->setEscapeStrings(GetKeyValueBool(param));
+  } else if (key != "format") {
+THROW_SQL_ERROR_AT() << "Unsupported copy option: " << key;
   }
 }
-escape_strings_ = params->escape_strings;
   }
 
   return 
L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
- 
copy_from_statement.source_filename()->value(),
- column_delimiter_[0],
- escape_strings_);
+ copy_from_statement.file_name()->value(),
+ BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+const ParseStatementCopy _to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+if (format == "csv") {
+  file_format = BulkIOFormat::kCSV;
+} else if (format == "text") {
+  file_format = BulkIOFormat::kText;
+} else {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+format_specified = true;
+break;
+  }
+}
+  }
+
+  const std::string _name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
+THROW_SQL_ERROR_AT(copy_to_statement.file_name())
+<< "File name can not be empty";
+  }
+
+  // Infer copy format from file name extension.
+  if (!format_specified) {
+if (file_name.length() > 4) {
+  if (ToLower(file_name.substr(file_name.length() - 4)) == ".csv") {
+file_format = BulkIOFormat::kCSV;
+  }
+}
+  }
+
+  // Resolve the copy options.
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136199029
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
+  THROW_SQL_ERROR_AT(parse_delimiter)
+  << "DELIMITER is not a single character";
+}
+options->setDelimiter(delimiter.front());
+  } else if (key == "escape_strings") {
+options->setEscapeStrings(GetKeyValueBool(param));
+  } else if (key != "format") {
+THROW_SQL_ERROR_AT() << "Unsupported copy option: " << key;
   }
 }
-escape_strings_ = params->escape_strings;
   }
 
   return 
L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
- 
copy_from_statement.source_filename()->value(),
- column_delimiter_[0],
- escape_strings_);
+ copy_from_statement.file_name()->value(),
+ BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+const ParseStatementCopy _to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+if (format == "csv") {
+  file_format = BulkIOFormat::kCSV;
+} else if (format == "text") {
+  file_format = BulkIOFormat::kText;
+} else {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+format_specified = true;
+break;
+  }
+}
+  }
+
+  const std::string _name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
+THROW_SQL_ERROR_AT(copy_to_statement.file_name())
+<< "File name can not be empty";
+  }
+
+  // Infer copy format from file name extension.
+  if (!format_specified) {
+if (file_name.length() > 4) {
+  if (ToLower(file_name.substr(file_name.length() - 4)) == ".csv") {
+file_format = BulkIOFormat::kCSV;
+  }
+}
+  }
+
+  // Resolve the copy options.
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136178749
  
--- Diff: parser/ParseStatement.hpp ---
@@ -60,16 +60,16 @@ class ParseStatement : public ParseTreeNode {
* @brief The possible types of SQL statements.
**/
   enum StatementType {
-kCreateTable,
+kCommand,
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
Github user jianqiao commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136175863
  
--- Diff: parser/ParseKeyValue.hpp ---
@@ -37,14 +37,15 @@ namespace quickstep {
  */
 class ParseKeyValue : public ParseTreeNode {
  public:
-  enum class KeyValueType {
+  enum KeyValueType {
+kStringBool,
--- End diff --

Updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136171013
  
--- Diff: relational_operators/TableExportOperator.cpp ---
@@ -0,0 +1,329 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#include "relational_operators/TableExportOperator.hpp"
+
+#include 
+#include 
+#include 
+#include 
+
+#include "catalog/CatalogAttribute.hpp"
+#include "query_execution/QueryContext.hpp"
+#include "query_execution/WorkOrderProtosContainer.hpp"
+#include "query_execution/WorkOrdersContainer.hpp"
+#include "relational_operators/WorkOrder.pb.h"
+#include "storage/StorageBlockInfo.hpp"
+#include "storage/ValueAccessor.hpp"
+#include "threading/SpinMutex.hpp"
+#include "types/TypedValue.hpp"
+#include "types/containers/Tuple.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/StringUtil.hpp"
+
+#include "glog/logging.h"
+
+#include "tmb/id_typedefs.h"
+
+namespace quickstep {
+
+bool TableExportOperator::getAllWorkOrders(
+WorkOrdersContainer *container,
+QueryContext *query_context,
+StorageManager *storage_manager,
+const tmb::client_id scheduler_client_id,
+tmb::MessageBus *bus) {
+  const auto add_work_order = [&](const block_id input_block_id,
+  const bool is_first_work_order) -> void {
+std::unique_ptr output_buffer = 
std::make_unique();
+container->addNormalWorkOrder(
+new TableExportToStringWorkOrder(query_id_,
+ input_relation_,
+ input_block_id,
+ options_->getFormat(),
+ is_first_work_order && 
options_->hasHeader(),
+ options_->getDelimiter(),
+ options_->escapeStrings(),
+ options_->getQuoteCharacter(),
+ options_->getNullString(),
+ output_buffer.get(),
+ op_index_,
+ scheduler_client_id,
+ storage_manager,
+ bus),
+op_index_);
+
+SpinMutexLock lock(output_buffers_mutex_);
+output_buffers_.emplace(input_block_id, output_buffer.release());
+  };
+
+  if (input_relation_is_stored_) {
+if (!started_) {
+  for (std::size_t i = 0; i < input_relation_block_ids_.size(); ++i) {
+add_work_order(input_relation_block_ids_[i], i == 0);
+  }
+  num_workorders_generated_ = input_relation_block_ids_.size();
+  started_ = true;
+}
+return true;
+  } else {
+while (num_workorders_generated_ < input_relation_block_ids_.size()) {
+  add_work_order(input_relation_block_ids_[num_workorders_generated_],
+ num_workorders_generated_ == 0);
+  ++num_workorders_generated_;
+}
+return done_feeding_input_relation_;
+  }
+}
+
+bool TableExportOperator::getAllWorkOrderProtos(
+WorkOrderProtosContainer *container) {
+  // TODO(quickstep-team): Implement TextExportOperator for the 
distributed case.
+  LOG(FATAL) << "TableExportOperator::getAllWorkOrderProtos() is not 
supported";
+}
+
+void TableExportOperator::receiveFeedbackMessage(
+const WorkOrder::FeedbackMessage ) {
+  DCHECK(TableExportOperator::kBlockOutputMessage == msg.type());
+  DCHECK(msg.payload_size() == sizeof(block_id));
+
+  if (file_ == nullptr) {
+const std::string lo_file_name = ToLower(file_name_);
+if (lo_file_name == "$stdout") {
+  file_ = stdout;
+} else if (lo_file_name 

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136162438
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
+  THROW_SQL_ERROR_AT(parse_delimiter)
+  << "DELIMITER is not a single character";
+}
+options->setDelimiter(delimiter.front());
+  } else if (key == "escape_strings") {
+options->setEscapeStrings(GetKeyValueBool(param));
+  } else if (key != "format") {
+THROW_SQL_ERROR_AT() << "Unsupported copy option: " << key;
   }
 }
-escape_strings_ = params->escape_strings;
   }
 
   return 
L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
- 
copy_from_statement.source_filename()->value(),
- column_delimiter_[0],
- escape_strings_);
+ copy_from_statement.file_name()->value(),
+ BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+const ParseStatementCopy _to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+if (format == "csv") {
+  file_format = BulkIOFormat::kCSV;
+} else if (format == "text") {
+  file_format = BulkIOFormat::kText;
+} else {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+format_specified = true;
+break;
+  }
+}
+  }
+
+  const std::string _name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
+THROW_SQL_ERROR_AT(copy_to_statement.file_name())
+<< "File name can not be empty";
+  }
+
+  // Infer copy format from file name extension.
+  if (!format_specified) {
+if (file_name.length() > 4) {
--- End diff --

Should we change to `4u` to avoid unsign-vs-sign-compare warning?

Btw, do we want to support a file name `.csv`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled 

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136148771
  
--- Diff: parser/ParseStatement.hpp ---
@@ -771,122 +776,135 @@ class ParseStatementInsertSelection : public 
ParseStatementInsert {
   DISALLOW_COPY_AND_ASSIGN(ParseStatementInsertSelection);
 };
 
+
 /**
- * @brief Optional parameters for a COPY FROM statement.
+ * @brief The parsed representation of a COPY FROM/COPY TO statement.
  **/
-struct ParseCopyFromParams : public ParseTreeNode {
-  /**
-   * @brief Constructor, sets default values.
-   **/
-  ParseCopyFromParams(const int line_number, const int column_number)
-  : ParseTreeNode(line_number, column_number),
-escape_strings(true) {
-  }
-
-  std::string getName() const override { return "CopyFromParams"; }
-
+class ParseStatementCopy : public ParseStatement {
+ public:
   /**
-   * @brief Sets the column delimiter.
-   *
-   * @param delimiter_in The column delimiter string.
+   * @brief Copy direction (FROM text file/TO text file).
*/
-  void set_delimiter(ParseString* delimiter_in) {
-delimiter.reset(delimiter_in);
-  }
-
-  /**
-   * @brief The string which terminates individual attribute values in the
-   *input file. Can be NULL.
-   **/
-  std::unique_ptr delimiter;
+  enum CopyDirection {
+kFrom,
--- End diff --

Change to `kFrom = 0,`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136173360
  
--- Diff: relational_operators/TextScanOperator.cpp ---
@@ -82,14 +81,19 @@ static bool ValidateTextScanTextSegmentSize(const char 
*flagname,
   return true;
 }
 
-static const volatile bool text_scan_text_segment_size_dummy = 
gflags::RegisterFlagValidator(
-_textscan_text_segment_size, );
+static const volatile bool text_scan_text_segment_size_dummy =
+gflags::RegisterFlagValidator(
+_textscan_text_segment_size, 
);
 
 namespace {
 
-size_t getFileSize(const string _name) {
+static std::size_t GetFileSize(const std::string _name) {
--- End diff --

Remove `static`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136164669
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
+  THROW_SQL_ERROR_AT(parse_delimiter)
+  << "DELIMITER is not a single character";
+}
+options->setDelimiter(delimiter.front());
+  } else if (key == "escape_strings") {
+options->setEscapeStrings(GetKeyValueBool(param));
+  } else if (key != "format") {
+THROW_SQL_ERROR_AT() << "Unsupported copy option: " << key;
   }
 }
-escape_strings_ = params->escape_strings;
   }
 
   return 
L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
- 
copy_from_statement.source_filename()->value(),
- column_delimiter_[0],
- escape_strings_);
+ copy_from_statement.file_name()->value(),
+ BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+const ParseStatementCopy _to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+if (format == "csv") {
+  file_format = BulkIOFormat::kCSV;
+} else if (format == "text") {
+  file_format = BulkIOFormat::kText;
+} else {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+format_specified = true;
+break;
+  }
+}
+  }
+
+  const std::string _name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
+THROW_SQL_ERROR_AT(copy_to_statement.file_name())
+<< "File name can not be empty";
+  }
+
+  // Infer copy format from file name extension.
+  if (!format_specified) {
+if (file_name.length() > 4) {
+  if (ToLower(file_name.substr(file_name.length() - 4)) == ".csv") {
+file_format = BulkIOFormat::kCSV;
+  }
+}
+  }
+
+  // Resolve the copy options.
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+   

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136158846
  
--- Diff: query_optimizer/physical/CopyTo.hpp ---
@@ -0,0 +1,147 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_QUERY_OPTIMIZER_PHYSICAL_COPY_TO_HPP_
+#define QUICKSTEP_QUERY_OPTIMIZER_PHYSICAL_COPY_TO_HPP_
+
+#include 
+#include 
+#include 
+
+#include "query_optimizer/OptimizerTree.hpp"
+#include "query_optimizer/expressions/AttributeReference.hpp"
+#include "query_optimizer/physical/Physical.hpp"
+#include "query_optimizer/physical/PhysicalType.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+namespace optimizer {
+namespace physical {
+
+/** \addtogroup OptimizerPhysical
+ *  @{
+ */
+
+class CopyTo;
+typedef std::shared_ptr CopyToPtr;
+
+/**
+ * @brief Represents an operation that copies data from a relation to a 
text file.
+ */
+class CopyTo : public Physical {
+ public:
+  PhysicalType getPhysicalType() const override {
+return PhysicalType::kCopyTo;
+  }
+
+  std::string getName() const override {
+return "CopyTo";
+  }
+
+  /**
+   * @return The input relation whose data is to be exported.
+   */
+  const PhysicalPtr& input() const {
+return input_;
+  }
+
+  /**
+   * @return The name of the file to write the data to.
+   */
+  const std::string& file_name() const {
+return file_name_;
+  }
+
+  /**
+   * @return The options for this COPY TO statement.
+   */
+  BulkIOConfigurationPtr options() const {
--- End diff --

Return a const ref.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136158363
  
--- Diff: query_optimizer/logical/CopyTo.hpp ---
@@ -0,0 +1,141 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_QUERY_OPTIMIZER_LOGICAL_COPY_TO_HPP_
+#define QUICKSTEP_QUERY_OPTIMIZER_LOGICAL_COPY_TO_HPP_
+
+#include 
+#include 
+#include 
+
+#include "query_optimizer/OptimizerTree.hpp"
+#include "query_optimizer/expressions/AttributeReference.hpp"
+#include "query_optimizer/logical/Logical.hpp"
+#include "query_optimizer/logical/LogicalType.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+namespace optimizer {
+namespace logical {
+
+/** \addtogroup OptimizerLogical
+ *  @{
+ */
+
+class CopyTo;
+typedef std::shared_ptr CopyToPtr;
+
+/**
+ * @brief Represents an operation that copies data from a relation to a 
text file.
+ */
+class CopyTo : public Logical {
+ public:
+  LogicalType getLogicalType() const override {
+return LogicalType::kCopyTo;
+  }
+
+  std::string getName() const override {
+return "CopyTo";
+  }
+
+  /**
+   * @return The input relation whose data is to be exported.
+   */
+  const LogicalPtr& input() const {
+return input_;
+  }
+
+  /**
+   * @return The name of the file to write the data to.
+   */
+  const std::string& file_name() const {
+return file_name_;
+  }
+
+  /**
+   * @return The options for this COPY TO statement.
+   */
+  BulkIOConfigurationPtr options() const {
--- End diff --

Return a const ref.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136158497
  
--- Diff: query_optimizer/logical/LogicalType.hpp ---
@@ -34,6 +34,7 @@ namespace logical {
 enum class LogicalType {
   kAggregate,
--- End diff --

Change to `kAggregate = 0,`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136149449
  
--- Diff: parser/ParseStatement.hpp ---
@@ -898,24 +916,53 @@ class ParseStatementCopyFrom : public ParseStatement {
   std::vector *non_container_child_fields,
   std::vector *container_child_field_names,
   std::vector 
*container_child_fields) const override {
-inline_field_names->push_back("relation_name");
-inline_field_values->push_back(relation_name_->value());
+inline_field_names->push_back("direction");
+inline_field_values->push_back(direction_ == kFrom ? "FROM" : "TO");
 
-inline_field_names->push_back("source_file");
-inline_field_values->push_back(source_filename_->value());
+inline_field_names->push_back("file");
+inline_field_values->push_back(file_name_->value());
+
+if (relation_name_ != nullptr) {
+  inline_field_names->push_back("relation_name");
+  inline_field_values->push_back(relation_name_->value());
+}
+
+if (set_operation_query_ != nullptr) {
+  non_container_child_field_names->push_back("set_operation_query");
+  non_container_child_fields->push_back(set_operation_query_.get());
+}
+
+if (with_clause_ != nullptr && !with_clause_->empty()) {
+  container_child_field_names->push_back("with_clause");
+  container_child_fields->emplace_back();
+  for (const ParseSubqueryTableReference _subquery : 
*with_clause_) {
+container_child_fields->back().push_back(_subquery);
+  }
+}
 
 if (params_ != nullptr) {
-  non_container_child_field_names->push_back("params");
-  non_container_child_fields->push_back(params_.get());
+  container_child_field_names->push_back("params");
+  container_child_fields->emplace_back();
+  for (const ParseKeyValue  : *params_) {
+container_child_fields->back().push_back();
+  }
 }
   }
 
  private:
+  const CopyDirection direction_;
+
+  // NOTE(jianqiao):
+  // (1) Either relation_name_ or set_operation_query_ has non-null value.
+  // (2) set_operation_query_ must be null for COPY FROM statement.
--- End diff --

Question: do we have these checks enforced in the constructors?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136165286
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -1595,6 +1742,19 @@ void 
Resolver::appendProjectIfNeedPrecomputationAfterAggregation(
   }
 }
 
+void Resolver::reportIfWithClauseUnused(
+const PtrVector _list) const {
+  if (!with_queries_info_.unreferenced_query_indexes.empty()) {
+int unreferenced_with_query_index = 
*with_queries_info_.unreferenced_query_indexes.begin();
--- End diff --

Mark `const`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136161344
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
--- End diff --

How about `auto options = 
std::make_shared(file_format);`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136162397
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
+  THROW_SQL_ERROR_AT(parse_delimiter)
+  << "DELIMITER is not a single character";
+}
+options->setDelimiter(delimiter.front());
+  } else if (key == "escape_strings") {
+options->setEscapeStrings(GetKeyValueBool(param));
+  } else if (key != "format") {
+THROW_SQL_ERROR_AT() << "Unsupported copy option: " << key;
   }
 }
-escape_strings_ = params->escape_strings;
   }
 
   return 
L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
- 
copy_from_statement.source_filename()->value(),
- column_delimiter_[0],
- escape_strings_);
+ copy_from_statement.file_name()->value(),
+ BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+const ParseStatementCopy _to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+if (format == "csv") {
+  file_format = BulkIOFormat::kCSV;
+} else if (format == "text") {
+  file_format = BulkIOFormat::kText;
+} else {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+format_specified = true;
+break;
+  }
+}
+  }
+
+  const std::string _name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
--- End diff --

Should we change to `1u` to avoid unsign-vs-sign-compare warning?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136165036
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -143,6 +145,45 @@ namespace E = ::quickstep::optimizer::expressions;
 namespace L = ::quickstep::optimizer::logical;
 namespace S = ::quickstep::serialization;
 
+namespace {
+
+static attribute_id GetAttributeIdFromName(
--- End diff --

We don't need `static` key word for methods in an anonymous namespace.

Ditto for below.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136154199
  
--- Diff: utility/BulkIOConfiguration.hpp ---
@@ -0,0 +1,198 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_UTILITY_BULK_IO_CONFIGURATION_HPP_
+#define QUICKSTEP_UTILITY_BULK_IO_CONFIGURATION_HPP_
+
+#include 
+#include 
+
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+
+/**
+ * @brief External file format for bulk I/O.
+ */
+enum class BulkIOFormat {
+  kCSV,
+  kText
+};
+
+class BulkIOConfiguration;
+typedef std::shared_ptr BulkIOConfigurationPtr;
+
+/**
+ * @brief Detailed file format configuration for bulk I/O (i.e. COPY 
operations)
+ *that moves data between Quickstep tables and external files.
+ */
+class BulkIOConfiguration {
--- End diff --

Change to `BulkIoConfiguration`.

See [code 
style](https://google.github.io/styleguide/cppguide.html#Type_Names) example 
`UrlTable`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136162361
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
--- End diff --

Should we change to `1u` to avoid unsign-vs-sign-compare warning?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136172992
  
--- Diff: relational_operators/TableExportOperator.hpp ---
@@ -0,0 +1,268 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+#define QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "catalog/CatalogRelation.hpp"
+#include "catalog/CatalogTypedefs.hpp"
+#include "query_execution/QueryContext.hpp"
+#include "relational_operators/RelationalOperator.hpp"
+#include "relational_operators/WorkOrder.hpp"
+#include "storage/StorageBlockInfo.hpp"
+#include "threading/SpinMutex.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+#include "tmb/id_typedefs.h"
+
+namespace tmb { class MessageBus; }
+
+namespace quickstep {
+
+class CatalogRelationSchema;
+class StorageManager;
+class ValueAccessor;
+class WorkOrderProtosContainer;
+class WorkOrdersContainer;
+
+namespace serialization { class WorkOrder; }
+
+/** \addtogroup RelationalOperators
+ *  @{
+ */
+
+class TableExportOperator : public RelationalOperator {
+ public:
+  /**
+   * @brief Feedback message to Foreman when a 
TableExportToStringWorkOrder has
+   *completed writing a block to the string buffer.
+   */
+  enum FeedbackMessageType : WorkOrder::FeedbackMessageType {
+  kBlockOutputMessage,
+  };
+
+  /**
+   * @brief Constructor.
+   *
+   * @param query_id The ID of the query to which this operator belongs.
+   * @param input_relation The relation to export.
+   * @param input_relation_is_stored If input_relation is a stored 
relation and
+   *is fully available to the operator before it can start 
generating
+   *workorders.
+   * @param file_name The name of the file to export the relation to.
+   * @param options The options that specify the detailed format of the 
output
+   *file.
+   */
+  TableExportOperator(const std::size_t query_id,
+  const CatalogRelation _relation,
+  const bool input_relation_is_stored,
+  const std::string _name,
+  const BulkIOConfigurationPtr )
+  : RelationalOperator(query_id),
+input_relation_(input_relation),
+input_relation_is_stored_(input_relation_is_stored),
+file_name_(file_name),
+options_(options),
+input_relation_block_ids_(input_relation_is_stored
+  ? input_relation.getBlocksSnapshot()
+  : std::vector()),
+num_workorders_generated_(0),
+started_(false),
+num_blocks_written_(0),
+file_(nullptr) {}
+
+  ~TableExportOperator() override {}
+
+  OperatorType getOperatorType() const override {
+return kTableExport;
+  }
+
+  std::string getName() const override {
+return "TableExportOperator";
+  }
+
+  /**
+   * @return The relation to export.
+   */
+  const CatalogRelation& input_relation() const {
+return input_relation_;
+  }
+
+  bool getAllWorkOrders(WorkOrdersContainer *container,
+QueryContext *query_context,
+StorageManager *storage_manager,
+const tmb::client_id scheduler_client_id,
+tmb::MessageBus *bus) override;
+
+  bool getAllWorkOrderProtos(WorkOrderProtosContainer *container) override;
+
+  void feedInputBlock(const block_id input_block_id,
+  const relation_id input_relation_id,
+   

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136173116
  
--- Diff: relational_operators/TableExportOperator.hpp ---
@@ -0,0 +1,268 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+#define QUICKSTEP_RELATIONAL_OPERATORS_TABLE_EXPORT_OPERATOR_HPP_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "catalog/CatalogRelation.hpp"
+#include "catalog/CatalogTypedefs.hpp"
+#include "query_execution/QueryContext.hpp"
+#include "relational_operators/RelationalOperator.hpp"
+#include "relational_operators/WorkOrder.hpp"
+#include "storage/StorageBlockInfo.hpp"
+#include "threading/SpinMutex.hpp"
+#include "utility/BulkIOConfiguration.hpp"
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+#include "tmb/id_typedefs.h"
+
+namespace tmb { class MessageBus; }
+
+namespace quickstep {
+
+class CatalogRelationSchema;
+class StorageManager;
+class ValueAccessor;
+class WorkOrderProtosContainer;
+class WorkOrdersContainer;
+
+namespace serialization { class WorkOrder; }
+
+/** \addtogroup RelationalOperators
+ *  @{
+ */
+
+class TableExportOperator : public RelationalOperator {
+ public:
+  /**
+   * @brief Feedback message to Foreman when a 
TableExportToStringWorkOrder has
+   *completed writing a block to the string buffer.
+   */
+  enum FeedbackMessageType : WorkOrder::FeedbackMessageType {
+  kBlockOutputMessage,
+  };
+
+  /**
+   * @brief Constructor.
+   *
+   * @param query_id The ID of the query to which this operator belongs.
+   * @param input_relation The relation to export.
+   * @param input_relation_is_stored If input_relation is a stored 
relation and
+   *is fully available to the operator before it can start 
generating
+   *workorders.
+   * @param file_name The name of the file to export the relation to.
+   * @param options The options that specify the detailed format of the 
output
+   *file.
+   */
+  TableExportOperator(const std::size_t query_id,
+  const CatalogRelation _relation,
+  const bool input_relation_is_stored,
+  const std::string _name,
+  const BulkIOConfigurationPtr )
+  : RelationalOperator(query_id),
+input_relation_(input_relation),
+input_relation_is_stored_(input_relation_is_stored),
+file_name_(file_name),
+options_(options),
+input_relation_block_ids_(input_relation_is_stored
+  ? input_relation.getBlocksSnapshot()
+  : std::vector()),
+num_workorders_generated_(0),
+started_(false),
+num_blocks_written_(0),
+file_(nullptr) {}
+
+  ~TableExportOperator() override {}
+
+  OperatorType getOperatorType() const override {
+return kTableExport;
+  }
+
+  std::string getName() const override {
+return "TableExportOperator";
+  }
+
+  /**
+   * @return The relation to export.
+   */
+  const CatalogRelation& input_relation() const {
+return input_relation_;
+  }
+
+  bool getAllWorkOrders(WorkOrdersContainer *container,
+QueryContext *query_context,
+StorageManager *storage_manager,
+const tmb::client_id scheduler_client_id,
+tmb::MessageBus *bus) override;
+
+  bool getAllWorkOrderProtos(WorkOrderProtosContainer *container) override;
+
+  void feedInputBlock(const block_id input_block_id,
+  const relation_id input_relation_id,
+   

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136148367
  
--- Diff: parser/ParseStatement.hpp ---
@@ -60,16 +60,16 @@ class ParseStatement : public ParseTreeNode {
* @brief The possible types of SQL statements.
**/
   enum StatementType {
-kCreateTable,
+kCommand,
--- End diff --

Change to `kCommand = 0,`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136158632
  
--- Diff: query_optimizer/physical/CopyFrom.hpp ---
@@ -68,22 +69,14 @@ class CopyFrom : public Physical {
   const std::string& file_name() const { return file_name_; }
 
   /**
-   * @return The delimiter used in the text file to separate columns.
+   * @return The options for this COPY FROM statement.
*/
-  const char column_delimiter() const { return column_delimiter_; }
-
-  /**
-   * @return Whether to decode escape sequences in the text file.
-   */
-  bool escape_strings() const { return escape_strings_; }
+  BulkIOConfigurationPtr options() const { return options_; }
--- End diff --

Return a const ref.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136156631
  
--- Diff: utility/BulkIOConfiguration.hpp ---
@@ -0,0 +1,198 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_UTILITY_BULK_IO_CONFIGURATION_HPP_
+#define QUICKSTEP_UTILITY_BULK_IO_CONFIGURATION_HPP_
+
+#include 
+#include 
+
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+
+/**
+ * @brief External file format for bulk I/O.
+ */
+enum class BulkIOFormat {
+  kCSV,
+  kText
+};
+
+class BulkIOConfiguration;
+typedef std::shared_ptr BulkIOConfigurationPtr;
+
+/**
+ * @brief Detailed file format configuration for bulk I/O (i.e. COPY 
operations)
+ *that moves data between Quickstep tables and external files.
+ */
+class BulkIOConfiguration {
+ public:
+  /**
+   * @brief Constructor.
+   *
+   * @param format External file format.
+   */
+  BulkIOConfiguration(const BulkIOFormat format)
+  : format_(format) {
+initializeDefaultParameters(format);
+  }
+
+  /**
+   * @brief Get the external file format.
+   *
+   * @return The external file format.
+   */
+  inline BulkIOFormat getFormat() const {
+return format_;
+  }
+
+  /**
+   * @brief Get the external file format's name.
+   *
+   * @return The external file format's name.
+   */
+  inline std::string getFormatName() const {
+switch (format_) {
+  case BulkIOFormat::kCSV:
+return "CSV";
+  case BulkIOFormat::kText:
+return "TEXT";
+  default:
+break;
+}
+LOG(FATAL) << "Unexpected format in 
BulkIOConfiguration::getFormatName()";
+  }
+
+  /**
+   * @brief Get the delimiter character (which is the character that 
separates
+   *attribute values in external files).
+   *
+   * @return The delimiter character.
+   */
+  inline char getDelimiter() const {
+return delimiter_;
+  }
+
+  /**
+   * @brief Set the delimiter character.
+   *
+   * @param delimiter The delimiter character to set.
+   */
+  inline void setDelimiter(const char delimiter) {
+delimiter_ = delimiter;
+  }
+
+  /**
+   * @brief Check whether to encode/decode between special characters and 
escape
+   *sequences.
+   *
+   * @return Whether to encode/decode between special characters and escape
+   * sequences.
+   */
+  inline bool escapeStrings() const {
--- End diff --

Optionally, change to `escape_strings()` and `set_escape_strings()`, 
according to [code 
style](https://google.github.io/styleguide/cppguide.html#Function_Names).

Same for `delimiter()`, `header()`, `quote()`, and `null_string()`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136161900
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
+  THROW_SQL_ERROR_AT(parse_delimiter)
+  << "DELIMITER is not a single character";
+}
+options->setDelimiter(delimiter.front());
+  } else if (key == "escape_strings") {
+options->setEscapeStrings(GetKeyValueBool(param));
+  } else if (key != "format") {
+THROW_SQL_ERROR_AT() << "Unsupported copy option: " << key;
   }
 }
-escape_strings_ = params->escape_strings;
   }
 
   return 
L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
- 
copy_from_statement.source_filename()->value(),
- column_delimiter_[0],
- escape_strings_);
+ copy_from_statement.file_name()->value(),
+ BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+const ParseStatementCopy _to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+if (format == "csv") {
+  file_format = BulkIOFormat::kCSV;
+} else if (format == "text") {
+  file_format = BulkIOFormat::kText;
+} else {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+format_specified = true;
+break;
+  }
+}
+  }
+
+  const std::string _name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
+THROW_SQL_ERROR_AT(copy_to_statement.file_name())
+<< "File name can not be empty";
+  }
+
+  // Infer copy format from file name extension.
+  if (!format_specified) {
+if (file_name.length() > 4) {
+  if (ToLower(file_name.substr(file_name.length() - 4)) == ".csv") {
+file_format = BulkIOFormat::kCSV;
+  }
+}
+  }
+
+  // Resolve the copy options.
+  std::unique_ptr options =
+  std::make_unique(file_format);
--- End diff --

Ditto for `std::make_shared`.


---
If your project is set 

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136161515
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
--- End diff --

Remove `&`, as it is a copy indeed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136154272
  
--- Diff: utility/BulkIOConfiguration.hpp ---
@@ -0,0 +1,198 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_UTILITY_BULK_IO_CONFIGURATION_HPP_
+#define QUICKSTEP_UTILITY_BULK_IO_CONFIGURATION_HPP_
+
+#include 
+#include 
+
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+
+/**
+ * @brief External file format for bulk I/O.
+ */
+enum class BulkIOFormat {
+  kCSV,
+  kText
+};
+
+class BulkIOConfiguration;
+typedef std::shared_ptr BulkIOConfigurationPtr;
+
+/**
+ * @brief Detailed file format configuration for bulk I/O (i.e. COPY 
operations)
+ *that moves data between Quickstep tables and external files.
+ */
+class BulkIOConfiguration {
+ public:
+  /**
+   * @brief Constructor.
+   *
+   * @param format External file format.
+   */
+  BulkIOConfiguration(const BulkIOFormat format)
--- End diff --

Add `explicit`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136153634
  
--- Diff: utility/BulkIOConfiguration.hpp ---
@@ -0,0 +1,198 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ **/
+
+#ifndef QUICKSTEP_UTILITY_BULK_IO_CONFIGURATION_HPP_
+#define QUICKSTEP_UTILITY_BULK_IO_CONFIGURATION_HPP_
+
+#include 
+#include 
+
+#include "utility/Macros.hpp"
+
+#include "glog/logging.h"
+
+namespace quickstep {
+
+/**
+ * @brief External file format for bulk I/O.
+ */
+enum class BulkIOFormat {
+  kCSV,
--- End diff --

Change to `kCSV = 0,`. Personally, prefer to use `kCsv`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread zuyu
Github user zuyu commented on a diff in the pull request:

https://github.com/apache/incubator-quickstep/pull/291#discussion_r136161982
  
--- Diff: query_optimizer/resolver/Resolver.cpp ---
@@ -418,27 +455,157 @@ L::LogicalPtr Resolver::resolve(const ParseStatement 
_query) {
 }
 
 L::LogicalPtr Resolver::resolveCopyFrom(
-const ParseStatementCopyFrom _from_statement) {
-  // Default parameters.
-  std::string column_delimiter_ = "\t";
-  bool escape_strings_ = true;
+const ParseStatementCopy _from_statement) {
+  DCHECK(copy_from_statement.getCopyDirection() == 
ParseStatementCopy::kFrom);
+  const PtrList *params = copy_from_statement.params();
 
-  const ParseCopyFromParams *params = copy_from_statement.params();
+  BulkIOFormat file_format = BulkIOFormat::kText;
   if (params != nullptr) {
-if (params->delimiter != nullptr) {
-  column_delimiter_ = params->delimiter->value();
-  if (column_delimiter_.size() != 1) {
-THROW_SQL_ERROR_AT(params->delimiter)
-<< "DELIMITER is not a single character";
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+// TODO(jianqiao): Support other bulk load formats such as CSV.
+if (format != "text") {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+// Update file_format when other formats get supported.
+break;
+  }
+}
+  }
+
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "delimiter") {
+const ParseString *parse_delimiter = GetKeyValueString(param);
+const std::string  = parse_delimiter->value();
+if (delimiter.size() != 1) {
+  THROW_SQL_ERROR_AT(parse_delimiter)
+  << "DELIMITER is not a single character";
+}
+options->setDelimiter(delimiter.front());
+  } else if (key == "escape_strings") {
+options->setEscapeStrings(GetKeyValueBool(param));
+  } else if (key != "format") {
+THROW_SQL_ERROR_AT() << "Unsupported copy option: " << key;
   }
 }
-escape_strings_ = params->escape_strings;
   }
 
   return 
L::CopyFrom::Create(resolveRelationName(copy_from_statement.relation_name()),
- 
copy_from_statement.source_filename()->value(),
- column_delimiter_[0],
- escape_strings_);
+ copy_from_statement.file_name()->value(),
+ BulkIOConfigurationPtr(options.release()));
+}
+
+L::LogicalPtr Resolver::resolveCopyTo(
+const ParseStatementCopy _to_statement) {
+  DCHECK(copy_to_statement.getCopyDirection() == ParseStatementCopy::kTo);
+  const PtrList *params = copy_to_statement.params();
+
+  // Check if copy format is explicitly specified.
+  BulkIOFormat file_format = BulkIOFormat::kText;
+  bool format_specified = false;
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+  const std::string  = ToLower(param.key()->value());
+  if (key == "format") {
+const ParseString *parse_format = GetKeyValueString(param);
+const std::string format = ToLower(parse_format->value());
+if (format == "csv") {
+  file_format = BulkIOFormat::kCSV;
+} else if (format == "text") {
+  file_format = BulkIOFormat::kText;
+} else {
+  THROW_SQL_ERROR_AT(parse_format) << "Unsupported file format: " 
<< format;
+}
+format_specified = true;
+break;
+  }
+}
+  }
+
+  const std::string _name = copy_to_statement.file_name()->value();
+  if (file_name.length() <= 1) {
+THROW_SQL_ERROR_AT(copy_to_statement.file_name())
+<< "File name can not be empty";
+  }
+
+  // Infer copy format from file name extension.
+  if (!format_specified) {
+if (file_name.length() > 4) {
+  if (ToLower(file_name.substr(file_name.length() - 4)) == ".csv") {
+file_format = BulkIOFormat::kCSV;
+  }
+}
+  }
+
+  // Resolve the copy options.
+  std::unique_ptr options =
+  std::make_unique(file_format);
+  if (params != nullptr) {
+for (const ParseKeyValue  : *params) {
+   

[GitHub] incubator-quickstep pull request #291: Add "COPY TO" operator for exporting ...

2017-08-30 Thread jianqiao
GitHub user jianqiao opened a pull request:

https://github.com/apache/incubator-quickstep/pull/291

Add "COPY TO" operator for exporting data from Quickstep.

This PR adds support for the "COPY TO" statement for exporting tables from 
Quickstep. Two formats `TEXT` and `CSV` are supported.

The current available copy options are:
- `FORMAT`: Format of the output file, either `TEXT` or `CSV`.
- `DELIMITER`: Separator character of the fields.
- `HEADER`: Whether to add table header. For `CSV` format only.
- `QUOTE`: The quote character. For `CSV` format only.
- `ESCAPE_STRINGS`: Whether to escape special characters. For `TEXT` format 
only.
- `NULL_STRING`: The string representation of the `NULL` value.

See the example queries and results 
[here](https://github.com/apache/incubator-quickstep/blob/a036acb446f137fea263ae218ef12f337f5bc1a1/query_optimizer/tests/execution_generator/Copy.test).
 

Note that some convenient features are also provided:
- Export the result table from a query.
```
-- (1) --
COPY
  SELECT x FROM r
TO 'data.txt';

-- (2) --
WITH s(v) AS (
  SELECT MIN(y) FROM r GROUP BY x
)
COPY
  SELECT AVG(v) FROM s
TO 'results.csv';
```

- Print to standard output/error stream, e.g.
```
-- (1) --
COPY r TO stdout;

-- (2) --
COPY 
  SELECT x FROM r
TO stderr;
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/incubator-quickstep copy-to

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/291.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #291


commit a036acb446f137fea263ae218ef12f337f5bc1a1
Author: Jianqiao Zhu 
Date:   2017-08-04T21:49:45Z

Add "COPY TO" operator for exporting data from Quickstep.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---