[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-09 Thread Francois Saint-Jacques (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011939#comment-17011939
 ] 

Francois Saint-Jacques commented on ARROW-7498:
---

You are right that it doesn't partition anything.

> [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme
> -
>
> Key: ARROW-7498
> URL: https://issues.apache.org/jira/browse/ARROW-7498
> Project: Apache Arrow
> Issue Type: Wish
> Components: C++ - Dataset
> Reporter: Francois Saint-Jacques
> Assignee: Francois Saint-Jacques
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> DataFragment -> Fragment
> DataSource -> Source
> PartitionScheme -> PartitionSchema
> *Discovery -> *Manifest



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-09 Thread Joris Van den Bossche (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011893#comment-17011893
 ] 

Joris Van den Bossche commented on ARROW-7498:
--

I personally find that "partitioner" sounds a bit strange to my (non-native 
English) ears. 
It sounds like the one that takes an action to partition data, which is not the 
case here; it is only describing an existing partitioning of the data?



[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-09 Thread Ben Kietzman (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011884#comment-17011884
 ] 

Ben Kietzman commented on ARROW-7498:
-

IMHO OrderedPartitioner is best. DirectoryPartitioner seems misleading, since 
all of the partition schemes parse directory names.
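
To make the point about directory names concrete, here is a minimal Python sketch of a Hive-style layout, where each directory name carries both the key and the value as `key=value`. This is not the Arrow API; `parse_hive_path` and the example path are hypothetical and only illustrate that this kind of scheme reads directory names too.

```python
from typing import Dict


def parse_hive_path(path: str) -> Dict[str, str]:
    """Collect key/value pairs from `key=value` directory segments."""
    result: Dict[str, str] = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        # Segments without "=" (for example the file name) carry no partition value.
        if sep and key:
            result[key] = value
    return result


print(parse_hive_path("/year=2009/month=11/data.parquet"))
# {'year': '2009', 'month': '11'}
```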



[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-08 Thread Francois Saint-Jacques (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010947#comment-17010947
 ] 

Francois Saint-Jacques commented on ARROW-7498:
---

As alternative names for SchemaPartitioner (each directory is a partition value), I have:

* StackPartitioner
* LevelPartitioner
* HierarchyPartitioner
* OrderedPartitioner
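
For readers unfamiliar with the behaviour being named (each directory is a partition value), here is a minimal Python sketch. It is not the Arrow C++ API; `parse_ordered_path` and the field names `year` and `month` are hypothetical, chosen only to show that the meaning of a directory depends on its position in the path.

```python
from typing import Dict, List


def parse_ordered_path(field_names: List[str], path: str) -> Dict[str, str]:
    """Pair each directory segment with the field name at the same depth."""
    segments = [s for s in path.split("/") if s]
    # zip stops at the shorter input, so extra segments or fields are ignored.
    return dict(zip(field_names, segments))


print(parse_ordered_path(["year", "month"], "/2009/11"))
# {'year': '2009', 'month': '11'}
```

Because the directory names carry only values, the order of the field names is what gives each level its meaning, which is the idea behind the "Ordered", "Level" and "Hierarchy" candidates.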



[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-08 Thread Ben Kietzman (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010783#comment-17010783
 ] 

Ben Kietzman commented on ARROW-7498:
-

I'd say Partitioner. Partitioning sounds more like it describes the output.



[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-08 Thread Francois Saint-Jacques (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010773#comment-17010773
 ] 

Francois Saint-Jacques commented on ARROW-7498:
---

Partitioning or Partitioner?



[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-07 Thread Krisztian Szucs (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009748#comment-17009748
 ] 

Krisztian Szucs commented on ARROW-7498:


Perhaps call it {{Partitioning}}, {{HivePartitioning}} etc.



[jira] [Commented] (ARROW-7498) [C++][Dataset] Rename DataFragment/DataSource/PartitionScheme

2020-01-06 Thread Ben Kietzman (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009056#comment-17009056
 ] 

Ben Kietzman commented on ARROW-7498:
-

+1. Drilling into PartitionSchema: SchemaPartitionScheme -> OrderedPartitionSchema?
