Fixed the "Failed to load PDF" issue

Project: http://git-wip-us.apache.org/repos/asf/carbondata-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata-site/commit/167fcae3
Tree: http://git-wip-us.apache.org/repos/asf/carbondata-site/tree/167fcae3
Diff: http://git-wip-us.apache.org/repos/asf/carbondata-site/diff/167fcae3

Branch: refs/heads/asf-site
Commit: 167fcae375c2fb7f4e11255968da932456dab505
Parents: edabb2a
Author: chenliang613 <chenliang...@apache.org>
Authored: Fri Aug 11 11:39:43 2017 +0800
Committer: chenliang613 <chenliang...@apache.org>
Committed: Fri Aug 11 11:39:43 2017 +0800

----------------------------------------------------------------------
 content/configuration-parameters.html    |  24 ++++++++
 content/ddl-operation-on-carbondata.html |  25 ++++++++-
 content/dml-operation-on-carbondata.html |  78 ++++++++++++++++++++++++--
 content/pdf/maven-pdf-plugin.pdf         | Bin 155540 -> 233933 bytes
 4 files changed, 121 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/167fcae3/content/configuration-parameters.html
----------------------------------------------------------------------
diff --git a/content/configuration-parameters.html 
b/content/configuration-parameters.html
index 96a91c6..d3c96ea 100644
--- a/content/configuration-parameters.html
+++ b/content/configuration-parameters.html
@@ -292,6 +292,30 @@
 <td>The number of partitions to use when shuffling data for sort. If the user 
does not configure it, or configures a value less than 1, the number of map 
tasks is used as the number of reduce tasks. In general, we recommend 2-3 tasks 
per CPU core in your cluster.</td>
 <td></td>
 </tr>
+<tr>
+<td>carbon.options.bad.records.logger.enable</td>
+<td>false</td>
+<td>Whether to create logs with details about bad records.</td>
+<td></td>
+</tr>
+<tr>
+<td>carbon.bad.records.action</td>
+<td>fail</td>
+<td>This property supports four types of actions for bad records: FORCE, 
REDIRECT, IGNORE and FAIL. If set to FORCE, the data is auto-corrected by 
storing the bad records as NULL. If set to REDIRECT, bad records are written to 
the raw CSV instead of being loaded. If set to IGNORE, bad records are neither 
loaded nor written to the raw CSV. If set to FAIL, data loading fails if any 
bad records are found.</td>
+<td></td>
+</tr>
+<tr>
+<td>carbon.options.is.empty.data.bad.record</td>
+<td>false</td>
+<td>If false, empty data ("", '', or ,,) is not treated as a bad record; if 
true, it is.</td>
+<td></td>
+</tr>
+<tr>
+<td>carbon.options.bad.record.path</td>
+<td></td>
+<td>Specifies the HDFS path where bad records are stored. By default the value 
is null. The user must configure this path if the bad records logger is enabled 
or the bad records action is set to REDIRECT.</td>
+<td></td>
+</tr>
 </tbody>
 </table>
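+<p>As a sketch (property names are taken from the table above; the path value 
+is a placeholder), these system-level defaults could be set together in the 
+carbon.properties file:</p>
+<pre><code>carbon.options.bad.records.logger.enable=true
+carbon.bad.records.action=REDIRECT
+carbon.options.is.empty.data.bad.record=false
+carbon.options.bad.record.path=hdfs://hacluster/tmp/carbon
+</code></pre>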
 <ul>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/167fcae3/content/ddl-operation-on-carbondata.html
----------------------------------------------------------------------
diff --git a/content/ddl-operation-on-carbondata.html 
b/content/ddl-operation-on-carbondata.html
index 5c0cd9e..19cb64d 100644
--- a/content/ddl-operation-on-carbondata.html
+++ b/content/ddl-operation-on-carbondata.html
@@ -275,7 +275,13 @@ By default inverted index is enabled. The user can disable 
the inverted index cr
 <li>
 <p>All dimensions except complex datatype columns are part of multi 
dimensional key(MDK). This behavior can be overridden by using TBLPROPERTIES. 
If the user wants to keep any column (except columns of complex datatype) in 
multi dimensional key then he can keep the columns either in DICTIONARY_EXCLUDE 
or DICTIONARY_INCLUDE.</p>
 </li>
+<li>
+<p><strong>Sort Columns Configuration</strong></p>
+<p>The "SORT_COLUMNS" property lets users specify which columns belong to the 
MDK index. If the user does not specify "SORT_COLUMNS", by default the MDK 
index is built from all dimension columns except complex datatype columns.</p>
+</li>
 </ul>
+<pre><code>       TBLPROPERTIES ('SORT_COLUMNS'='column1, column3')
+</code></pre>
 <h3>
 <a id="example" class="anchor" href="#example" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
 <pre><code>    CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
@@ -290,8 +296,25 @@ By default inverted index is enabled. The user can disable 
the inverted index cr
       STORED BY 'carbondata'
       TBLPROPERTIES ('DICTIONARY_EXCLUDE'='storeCity',
                      'DICTIONARY_INCLUDE'='productNumber',
-                     'NO_INVERTED_INDEX'='productBatch')
+                     'NO_INVERTED_INDEX'='productBatch',
+                     'SORT_COLUMNS'='productName,storeCity')
+</code></pre>
+<ul>
+<li><strong>SORT_COLUMNS</strong></li>
+</ul>
+<pre><code>This table property specifies the order of the sort columns.
+</code></pre>
+<pre><code>    TBLPROPERTIES('SORT_COLUMNS'='column1, column3')
 </code></pre>
+<p>NOTE:</p>
+<ul>
+<li>
+<p>If this property is not specified, then by default SORT_COLUMNS consists of 
all dimension columns (excluding complex columns).</p>
+</li>
+<li>
+<p>If this property is specified with an empty argument, then the table will 
be loaded without sorting. For example, ('SORT_COLUMNS'='')</p>
+</li>
+</ul>
 <h2>
 <a id="show-table" class="anchor" href="#show-table" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>SHOW TABLE</h2>
 <p>This command can be used to list all the tables in current database or all 
the tables of a specific database.</p>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/167fcae3/content/dml-operation-on-carbondata.html
----------------------------------------------------------------------
diff --git a/content/dml-operation-on-carbondata.html 
b/content/dml-operation-on-carbondata.html
index d187004..6e27c75 100644
--- a/content/dml-operation-on-carbondata.html
+++ b/content/dml-operation-on-carbondata.html
@@ -304,10 +304,10 @@ column2:dictionaryFilePath2')
 <p>If this option is set to TRUE, then the high.cardinality.identify.enable 
property is disabled during data load.</p>
 </li>
 </ul>
-</li>
-</ul>
 <h3>
 <a id="example" class="anchor" href="#example" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
+</li>
+</ul>
 <pre><code>LOAD DATA local inpath '/opt/rawdata/data.csv' INTO table 
carbontable
 options('DELIMITER'=',', 'QUOTECHAR'='"','COMMENTCHAR'='#',
 'FILEHEADER'='empno,empname,designation,doj,workgroupcategory,
@@ -319,6 +319,74 @@ options('DELIMITER'=',', 'QUOTECHAR'='"','COMMENTCHAR'='#',
 'SINGLE_PASS'='TRUE'
 )
 </code></pre>
+<ul>
+<li>
+<p><strong>BAD RECORDS HANDLING:</strong> Methods of handling bad records are 
as follows:</p>
+<ul>
+<li>
+<p>Load all of the data before dealing with the errors.</p>
+</li>
+<li>
+<p>Clean or delete bad records before loading the data, or stop the load when 
bad records are found.</p>
+</li>
+</ul>
+<pre><code>OPTIONS('BAD_RECORDS_LOGGER_ENABLE'='true', 
'BAD_RECORD_PATH'='hdfs://hacluster/tmp/carbon', 
'BAD_RECORDS_ACTION'='REDIRECT', 'IS_EMPTY_DATA_BAD_RECORD'='false')
+</code></pre>
+<p>NOTE:</p>
+<ul>
+<li>
+<p>If the REDIRECT option is used, Carbon adds all bad records to a separate 
CSV file. However, this file must not be used for subsequent data loading 
because its content may not exactly match the source records. You are advised 
to cleanse the original source records before further data ingestion; this 
option serves to identify which records are bad.</p>
+</li>
+<li>
+<p>If all records in the loaded data are bad records, the BAD_RECORDS_ACTION 
setting is ignored and the load operation fails.</p>
+</li>
+<li>
+<p>The maximum number of characters per column is 100000. If there are more 
than 100000 characters in a column, data loading will fail.</p>
+</li>
+</ul>
+</li>
+</ul>
+<h3>
+<a id="example-1" class="anchor" href="#example-1" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
+<pre><code>LOAD DATA INPATH 'filepath.csv'
+INTO TABLE tablename
+OPTIONS('BAD_RECORDS_LOGGER_ENABLE'='true',
+'BAD_RECORD_PATH'='hdfs://hacluster/tmp/carbon',
+'BAD_RECORDS_ACTION'='REDIRECT',
+'IS_EMPTY_DATA_BAD_RECORD'='false');
+</code></pre>
+<p><strong>Bad Records Management Options:</strong></p>
+<table>
+<thead>
+<tr>
+<th>Options</th>
+<th>Default Value</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>BAD_RECORDS_LOGGER_ENABLE</td>
+<td>false</td>
+<td>Whether to create logs with details about bad records.</td>
+</tr>
+<tr>
+<td>BAD_RECORDS_ACTION</td>
+<td>FAIL</td>
+<td>Following are the four types of action for bad records:  FORCE: 
Auto-corrects the data by storing the bad records as NULL.  REDIRECT: Bad 
records are written to the raw CSV instead of being loaded.  IGNORE: Bad 
records are neither loaded nor written to the raw CSV.  FAIL: Data loading 
fails if any bad records are found.  NOTE: If all records in the loaded data 
are bad records, the BAD_RECORDS_ACTION setting is ignored and the load 
operation fails.</td>
+</tr>
+<tr>
+<td>IS_EMPTY_DATA_BAD_RECORD</td>
+<td>false</td>
+<td>If false, empty data ("", '', or ,,) is not treated as a bad record; if 
true, it is.</td>
+</tr>
+<tr>
+<td>BAD_RECORD_PATH</td>
+<td>-</td>
+<td>Specifies the HDFS path where bad records are stored. By default the value 
is null. The user must configure this path if the bad records logger is enabled 
or the bad records action is set to REDIRECT.</td>
+</tr>
+</tbody>
+</table>
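+<p>For example, to skip bad records silently instead of redirecting them (a 
+sketch; the file path and table name are placeholders):</p>
+<pre><code>LOAD DATA INPATH 'filepath.csv'
+INTO TABLE tablename
+OPTIONS('BAD_RECORDS_LOGGER_ENABLE'='false',
+'BAD_RECORDS_ACTION'='IGNORE');
+</code></pre>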
 <h2>
 <a id="insert-data-into-a-carbondata-table" class="anchor" 
href="#insert-data-into-a-carbondata-table" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>INSERT DATA INTO A 
CARBONDATA TABLE</h2>
 <p>This command inserts data into a CarbonData table. It is defined as a 
combination of an Insert query and a Select query. It inserts records from a 
source table into a target CarbonData table. The source table can be a Hive 
table, a Parquet table, or a CarbonData table itself. It can also aggregate 
the records of the source table by performing a Select query on it and loading 
the resulting records into the CarbonData table.</p>
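+<p>The two-query combination described above can be sketched as follows (the 
+table and column names are placeholders):</p>
+<pre><code>INSERT INTO TABLE target_carbon_table
+SELECT col1, col2 FROM source_table;
+</code></pre>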
@@ -416,7 +484,7 @@ LIMIT number_of_segments;
 </tbody>
 </table>
 <h3>
-<a id="example-1" class="anchor" href="#example-1" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
+<a id="example-2" class="anchor" href="#example-2" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
 <pre><code>SHOW SEGMENTS FOR TABLE CarbonDatabase.CarbonTable LIMIT 4;
 </code></pre>
 <h2>
@@ -458,7 +526,7 @@ Using this segment ID, you can remove the segment.</p>
 </tbody>
 </table>
 <h3>
-<a id="example-2" class="anchor" href="#example-2" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
+<a id="example-3" class="anchor" href="#example-3" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
 <pre><code>DELETE FROM TABLE CarbonDatabase.CarbonTable WHERE SEGMENT.ID IN 
(0);
 DELETE FROM TABLE CarbonDatabase.CarbonTable WHERE SEGMENT.ID IN (0,5,8);
 </code></pre>
@@ -499,7 +567,7 @@ WHERE SEGMENT.STARTTIME BEFORE DATE_VALUE
 </tbody>
 </table>
 <h3>
-<a id="example-3" class="anchor" href="#example-3" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
+<a id="example-4" class="anchor" href="#example-4" aria-hidden="true"><span 
aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
 <pre><code> DELETE FROM TABLE CarbonDatabase.CarbonTable 
  WHERE SEGMENT.STARTTIME BEFORE '2017-06-01 12:05:06';  
 </code></pre>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/167fcae3/content/pdf/maven-pdf-plugin.pdf
----------------------------------------------------------------------
diff --git a/content/pdf/maven-pdf-plugin.pdf b/content/pdf/maven-pdf-plugin.pdf
index cb6de01..37389f8 100644
Binary files a/content/pdf/maven-pdf-plugin.pdf and 
b/content/pdf/maven-pdf-plugin.pdf differ
