GitHub user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2632#discussion_r216156187
--- Diff: docs/datamap/lucene-datamap-guide.md ---
@@ -70,42 +66,38 @@ It will show all DataMaps created on main table.
USING 'lucene'
DMPROPERTIES ('INDEX_COLUMNS' = 'name, country',)
```
-
-**DMProperties**
-1. INDEX_COLUMNS: The list of string columns on which lucene creates
indexes.
-2. FLUSH_CACHE: size of the cache to maintain in Lucene writer, if
specified then it tries to
- aggregate the unique data till the cache limit and flush to Lucene. It
is best suitable for low
- cardinality dimensions.
-3. SPLIT_BLOCKLET: when made as true then store the data in blocklet wise
in lucene , it means new
- folder will be created for each blocklet, thus, it eliminates storing
blockletid in lucene and
- also it makes lucene small chunks of data.
+**Properties for Lucene DataMap**
+
+| Property | Is Required | Default Value | Description |
+|-------------|----------|--------|---------|
+| INDEX_COLUMNS | YES | | Carbondata will generate Lucene index on these
string columns. |
+| FLUSH_CACHE | NO | -1 | It defines the size of the cache to maintain in
Lucene writer. If specified, it tries to aggregate the unique data till the
cache limit and then flushes to Lucene. It is recommended to define FLUSH_CACHE
for low cardinality dimensions.|
+| SPLIT_BLOCKLET | NO | TRUE | When SPLIT_BLOCKLET is defined as "TRUE",
folders are created per blocklet by using the blockletID. This eliminates
indexing blockletID by lucene by storing only pageID and rowID, thus reducing
the size of indexes created by lucene. |
+
+**Folder Structure for lucene datamap:**
+ * Location of index files when Split BlockletId is TRUE:
+
+ tablePath/dataMapName/SegmentID/blockName/blockletID/..
+
+ * Location of index files when Split BlockletId is FALSE:
+
+ tablePath/dataMapName/SegmentID/blockName/..
## Loading data
-When loading data to main table, lucene index files will be generated for
all the
-index_columns(String Columns) given in DMProperties which contains
information about the data
-location of index_columns. These index files will be written inside a
folder named with datamap name
-inside each segment folders.
+When loading data to main table, lucene index files will be generated for
all the index_columns(String Columns) given in DMProperties which contains
information about the data location of index_columns. These index files will be
written into the path mentioned above.
--- End diff --
for all the index_columns(String Columns)
---
I think there is no need to mention 'String Columns' again, since it is
already mentioned in DMProperties.
---