100pah edited a comment on pull request #13358:
URL: 
https://github.com/apache/incubator-echarts/pull/13358#issuecomment-704408035


   # How should a user express data mapping for transitions?
   
   
   ## Issues
   
   First and foremost, we need to consider the issues below:
   
   ### ISSUE_I: If we need to "auto-detect the change of dimensions" between old data and new data, how do we implement it?
   We should consider:
   + We have never forced users to specify dimension names. Users may refer to dimensions by dimension index only, which is convenient in some practical scenarios.
   + If we implement "data mapping for transition animation" via "auto-detection of the change of dimensions", we can probably require users to specify dimension names if they want a "correct transition animation", and perform the mapping by the rule `MAPPING_ON_THE_SAME_DIMENSION_NAME`: if `oldData.dimensions[i].name` equals `newData.dimensions[j].name`, we map data items by their values on `oldData.dimensions[i]` and `newData.dimensions[j]`. **Is there any flaw in applying that rule**?
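
To make `MAPPING_ON_THE_SAME_DIMENSION_NAME` concrete, here is a minimal sketch. Both helper names are hypothetical (they are not echarts APIs), dimensions are assumed to be plain name strings, and rows are assumed to be arrays indexed by dimension.

```js
// Hypothetical helper: find dimension pairs that share a name.
function findSharedDimensions(oldDims, newDims) {
    const pairs = [];
    oldDims.forEach((oldName, i) => {
        // Unnamed dimensions (null/undefined) can never match by name.
        if (oldName == null) return;
        const j = newDims.indexOf(oldName);
        if (j >= 0) pairs.push({ name: oldName, oldIndex: i, newIndex: j });
    });
    return pairs;
}

// Hypothetical helper: map each old item to the new item that has the
// same value on the shared dimension; -1 means "no counterpart".
// If several new rows share a value, the last one wins in this sketch.
function mapItemsByDimension(oldRows, newRows, pair) {
    const newIndexByValue = new Map();
    newRows.forEach((row, idx) => newIndexByValue.set(row[pair.newIndex], idx));
    return oldRows.map((row, idx) => {
        const value = row[pair.oldIndex];
        return {
            oldIdx: idx,
            newIdx: newIndexByValue.has(value) ? newIndexByValue.get(value) : -1
        };
    });
}

// Illustrative values only.
const sharedPairs = findSharedDimensions(['Country', 'Income'], ['Population', 'Country']);
const itemMapping = mapItemsByDimension(
    [['France', 10], ['Germany', 20]],
    [[300, 'Germany'], [200, 'France']],
    sharedPairs[0]
);
console.log(sharedPairs, itemMapping);
```

One flaw this sketch already hints at: when more than one dimension name is shared, the rule alone does not say which pair to use.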
   
   
   ### ISSUE_II: The issues of "mapping by index":
   The default data mapping implementation is provided by `List['diff']`: if the names of data items are not specified, items are mapped by data index. "Mapping by index" is not a big deal in scenarios where the meaning of the transition goes unnoticed. But in scenarios where the meaning of the transition matters, like storytelling, any incorrect data mapping is probably inappropriate. For example:
   
   `dataA` is the raw data, where the dimensions are `['Year', 'Income', 
'Population', 'Sex', 'Country']`.
   `dataB` is calculated by:
   ```sql
   select avg(`Population`), avg(`Income`) from `dataA` group by `Sex`;
   ```
   `dataC` is calculated by:
   ```sql
   select avg(`Population`), avg(`Income`) from `dataA` group by `Country`;
   ```
   Suppose there are only two values in dimension `Country` (`'France'`, `'Germany'`), the same count as in dimension `Sex` (`'Woman'`, `'Man'`).
   Consequently, `dataB` and `dataC` contain exactly the same number of items.
   Given the data above, when `dataB` is switched to `dataC` via `setOption`, the data mapping should not be performed by index. Otherwise there will be misleading mappings from `'Man'` to `'France'` or from `'Woman'` to `'Germany'`. In this case, no transition animation is probably better than a misleading one.
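
The difference can be sketched as follows. `mapByIndex` and `mapByName` are hypothetical, simplified stand-ins for the two behaviours described above, not the real `List['diff']`; the row order is assumed for illustration.

```js
// Pair items purely by position.
function mapByIndex(oldNames, newNames) {
    const count = Math.min(oldNames.length, newNames.length);
    const pairs = [];
    for (let i = 0; i < count; i++) {
        pairs.push([oldNames[i], newNames[i]]);
    }
    return pairs;
}

// Pair items only when the same name exists on both sides.
function mapByName(oldNames, newNames) {
    return oldNames
        .filter(name => newNames.includes(name))
        .map(name => [name, name]);
}

const groupsOfDataB = ['Man', 'Woman'];       // grouped by `Sex`
const groupsOfDataC = ['France', 'Germany'];  // grouped by `Country`

console.log(mapByIndex(groupsOfDataB, groupsOfDataC));
// [['Man', 'France'], ['Woman', 'Germany']]  (misleading pairs)
console.log(mapByName(groupsOfDataB, groupsOfDataC));
// []  (no pairs, so no transition animation)
```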
   
   
   ### ISSUE_III: The issues of "when dimensions not changed":
   Suppose the dimensions are the same before and after `setOption` is called:
   The dimensions of `dataA` are `['Income', 'Population', 'Country']`,
   The dimensions of `dataB` are `['Income', 'Population', 'Country']`, exactly the same.
   But `dataB` is calculated by:
   ```sql
   select sum(`Income`), avg(`Population`) from `dataA` group by `Country`;
   ```
   Given the data above, the dimensions have not changed, but obviously the data should be mapped neither by index nor by the first shared dimension (`Income`). The appropriate mapping is on dimension `Country`, which, however, cannot be auto-detected.
   
   That is, even when the dimensions have not changed, a fully correct data mapping can hardly be auto-detected. User input about the transition is still needed in this case.
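
A small sketch of why the first shared dimension is the wrong choice here. The rows are made-up illustrative values, and both functions are hypothetical helpers that mimic the SQL above in plain JavaScript.

```js
// Rows are [Income, Population, Country].
function aggregateByCountry(rows) {
    const groups = new Map();
    for (const [income, population, country] of rows) {
        const g = groups.get(country) || { incomeSum: 0, populationSum: 0, count: 0 };
        g.incomeSum += income;
        g.populationSum += population;
        g.count += 1;
        groups.set(country, g);
    }
    // sum(Income), avg(Population), Country
    return Array.from(groups, ([country, g]) =>
        [g.incomeSum, g.populationSum / g.count, country]);
}

// For each old row, find the index of the first new row with the same
// value on the given dimension; -1 means "no counterpart".
function matchOn(oldRows, newRows, dimIndex) {
    return oldRows.map(oldRow =>
        newRows.findIndex(newRow => newRow[dimIndex] === oldRow[dimIndex]));
}

const rawRows = [
    [10, 100, 'France'],
    [20, 200, 'France'],
    [40, 300, 'Germany']
];
const aggregatedRows = aggregateByCountry(rawRows);

console.log(matchOn(rawRows, aggregatedRows, 0)); // on `Income`:  [-1, -1, 1]
console.log(matchOn(rawRows, aggregatedRows, 2)); // on `Country`: [0, 0, 1]
```

On `Income`, two rows find no counterpart and the third matches only by coincidence; on `Country`, every raw row maps to its aggregate.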
   
   
   ### ISSUE_IV: Issues about "the user specifies a dimension (called `key` below) to perform mapping":
   Suppose there are these requirements:
   1. `dataB`(`seriesB`)  ---transition1(on `'Country'`)--->  `dataA`(`seriesA`)
   2. `dataC`(`seriesC`)  ---transition2(on `'Income'`)--->  `dataA`(`seriesA`)
   
   We call the data before the "transition arrow" `from`, and the data after the arrow `to`.
   `transition1` needs the user to input the key `'Country'`, and `transition2` needs the user to input the key `'Income'`.
   That is, the "user-specified key" is related not only to `to` but also to `from`.
   That is, the "user-specified key" only works for this call of `setOption`, and should be discarded after `setOption` is called.
   That is, the "user-specified key" had better be set in the params of `setOption` rather than in the series option.
   
   If we intend to put the "user-specified key" in the series option, we probably need to lift the concept of that "key" so that it describes not this transition but a feature of the data itself. For example, it could describe which dimension is the unique key of the data, and echarts could subsequently derive an auto-mapping rule from that unique key. We will discuss it in detail below.
   
   
   ### ISSUE_V: Issues about "data not changed at all but transition animation needed".
   For example, users need a transition from a bar chart to a pie chart with the same data.
   Suppose there is `dataA`, whose dimensions are `['Income', 'Population', 'Country', 'Sex']`, and no dimension is suitable as a unique key. Echarts still needs to be able to perform data mapping.
   Obviously, the current default mapping rule (`List['diff']`), which maps by index, can handle that.
   But if we disable the "mapping by index" rule to avoid unexpected transition animations in other scenarios, how do we handle this case instead?
   A possible solution could be:
   ```js
   option = {
       dataset: [{
           dimensions: ['Income', 'Population', 'Country', 'Sex'],
           source: dataA
       }, {
           // Generate an extra dimension as id.
           transform: {
               type: 'id',
               dimensionIndex: 4,
               dimensionName: 'Id'
           }
       }],
       series: {
           type: 'custom',
           encode: { itemName: 'Id' },
           datasetIndex: 1
       }
   };
   ```
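
What such an `'id'` transform might do can be sketched in plain JavaScript. `appendIdDimension` is a hypothetical function written for illustration; it assumes the generated id is simply the row index, and its parameters mirror the `dimensionIndex` and `dimensionName` fields proposed above.

```js
// Hypothetical sketch: write the row index into an extra dimension so
// that every item gets a stable, unique id.
function appendIdDimension(dimensions, source, dimensionIndex, dimensionName) {
    const newDimensions = dimensions.slice();
    newDimensions[dimensionIndex] = dimensionName;
    const newSource = source.map((row, rowIndex) => {
        const copy = row.slice();       // leave the upstream data untouched
        copy[dimensionIndex] = rowIndex;
        return copy;
    });
    return { dimensions: newDimensions, source: newSource };
}

// Illustrative values only.
const withId = appendIdDimension(
    ['Income', 'Population', 'Country', 'Sex'],
    [[10, 100, 'France', 'Woman'], [20, 200, 'Germany', 'Man']],
    4,
    'Id'
);
console.log(withId.dimensions); // ['Income', 'Population', 'Country', 'Sex', 'Id']
console.log(withId.source);
```

Because the id depends only on row position, two series that read the same upstream data get identical ids, which is what lets a bar series and a pie series be matched item by item.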
   
   <br>
   
   ## Solutions
   
   Based on the scenarios listed above, I have summarized two designs for **how a user expresses data mapping for transition**.
   
   
   ### SOLUTION_A: The dimension key for data mapping is set in the parameter of `setOption`.
   That is, the user is responsible for setting the "from dimension" and "to dimension" of the data mapping when intending to have a transition animation.
   
   The advantages:
   + The API is relatively more "atomic". Users can control everything about the transition, which might avoid some bad cases that we have not thought of.
   + It's not hard for users to configure in "linear scene changing" (that is, optionA -> optionB -> optionC, a linked list rather than a directed graph).
   
   The disadvantages:
   + It's not easy for users to configure in "directed-graph scene changing", where users might need an upper layer to manage transition settings.
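
The "upper layer" mentioned in the disadvantage could look roughly like the sketch below. `TransitionTable` and the `{ from, to }` shape are assumptions made for illustration; no such `setOption` parameter exists yet. The two edges are the ones from ISSUE_IV.

```js
// Hypothetical upper layer: one entry per directed edge between scenes.
class TransitionTable {
    constructor() {
        this.edges = new Map();
    }

    addTransition(fromScene, toScene, keys) {
        this.edges.set(fromScene + '->' + toScene, keys);
    }

    // Returns the mapping keys for this edge, or null when the user has
    // not configured a transition in that direction.
    getTransition(fromScene, toScene) {
        return this.edges.get(fromScene + '->' + toScene) || null;
    }
}

const transitions = new TransitionTable();
transitions.addTransition('dataB', 'dataA', { from: 'Country', to: 'Country' });
transitions.addTransition('dataC', 'dataA', { from: 'Income', to: 'Income' });

console.log(transitions.getTransition('dataB', 'dataA')); // { from: 'Country', to: 'Country' }
console.log(transitions.getTransition('dataA', 'dataB')); // null
```

In a linear sequence the table has one entry per step; in a directed graph it needs one entry per edge, which is the bookkeeping the disadvantage refers to.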
   
   
   
   ### SOLUTION_B: The "key" about transition is set in series option.
   
   The key points of this strategy:
   + Apply `MAPPING_ON_THE_SAME_DIMENSION_NAME`.
   + The user is responsible for specifying the "unique key" of the data, which is subsequently used to select the transition key.
       + The term "unique key" follows the same concept as a unique key in a database.
       + `PENDING_I`: how to specify the unique key?
           + We can use the existing setting `series.encode.itemId` to specify the unique key. Its only difference from `series.encode.itemName` is that it is not displayed in the default tooltip.
   
   For compatibility with the current mapping strategy, when `setOption` is called, we apply the following rules:
   + Get `UNIQUE_KEY_DIMENSION_NAME`: if `series.encode.itemId` is specified and its dimension has a name, that name is the `UNIQUE_KEY_DIMENSION_NAME`.
   + If there is `newData`.`UNIQUE_KEY_DIMENSION_NAME`, look it up in `oldData`. If any dimension has the same name, we have the transition mapping dimensions `from` and `to`.
   + Else if there is `oldData`.`UNIQUE_KEY_DIMENSION_NAME`, look it up in `newData`. If any dimension has the same name, we have the transition mapping dimensions `from` and `to`.
   + Else if there is `newData`.`UNIQUE_KEY_DIMENSION_NAME`, do not apply transition animation.
       + This provides a way to disable unexpected transitions.
   + Else apply the existing mapping rule (`List['diff']`).
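
The rules above can be read as the following decision function. This is only one interpretation, under the assumption that a unique key which finds no same-named dimension on the other side falls through to the next rule. `resolveTransitionMapping` and the `{ dimensions, uniqueKey }` input shape are hypothetical, not existing echarts code.

```js
// Hypothetical sketch. `uniqueKey` stands for UNIQUE_KEY_DIMENSION_NAME,
// or null when `series.encode.itemId` is absent or refers to an unnamed
// dimension.
function resolveTransitionMapping(oldData, newData) {
    const oldKey = oldData.uniqueKey;
    const newKey = newData.uniqueKey;

    // New data's unique key also exists in old data: map on it.
    if (newKey != null && oldData.dimensions.includes(newKey)) {
        return { type: 'byDimension', from: newKey, to: newKey };
    }
    // Old data's unique key also exists in new data: map on it.
    if (oldKey != null && newData.dimensions.includes(oldKey)) {
        return { type: 'byDimension', from: oldKey, to: oldKey };
    }
    // New data declares a key but nothing matched: disable the transition.
    if (newKey != null) {
        return { type: 'none' };
    }
    // Otherwise keep today's behaviour (`List['diff']`).
    return { type: 'default' };
}

const dims = ['Income', 'Population', 'Country'];
console.log(resolveTransitionMapping(
    { dimensions: dims, uniqueKey: null },
    { dimensions: dims, uniqueKey: 'Country' }
)); // { type: 'byDimension', from: 'Country', to: 'Country' }
```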
   
   
   **User usage hints of SOLUTION_B:**
   
   Scenario in (ISSUE_II):
   Expect no transition animation.
   ```js
   chart.setOption({
       series: {
           encode: { itemId: 'Sex' },
           dimensions: ['Population', 'Income'],
           data: dataB_aggregate_by_Sex_from_dataA
       }
   });
   chart.setOption({
       series: {
           encode: { itemId: 'Country' },
           dimensions: ['Population', 'Income'],
           data: dataC_aggregate_by_Country_from_dataA
       }
   });
   ```
   
   Scenario in (ISSUE_III):
   Expect mapping by dimension `Country`.
   ```js
   chart.setOption({
       series: {
           encode: { itemId: -1 }, // Means no item id.
           dimensions: ['Income', 'Population', 'Country'],
           data: dataA
       }
   });
   chart.setOption({
       series: {
           encode: { itemId: 'Country' },
           dimensions: ['Income', 'Population', 'Country'],
           data: dataB_aggregate_by_Country_from_dataA
       }
   });
   ```
   
   Scenario in (ISSUE_IV):
   ```js
   chart.setOption({
       series: {
           encode: { itemId: -1 }, // Means no item id.
           dimensions: ['Income', 'Population', 'Country', 'Sex'],
           data: dataA
       }
   });
   chart.setOption({
       series: {
           encode: { itemId: 'Country' },
           dimensions: ['Income', 'Population', 'Country'],
           data: dataB_aggregate_by_Country_from_dataA
       }
   });
   chart.setOption({
       series: {
           encode: { itemId: 'Sex' },
           dimensions: ['Income', 'Population', 'Sex'],
           data: dataC_aggregate_by_Sex_from_dataA
       }
   });
   ```
   
   Scenario in (ISSUE_V):
   Expect transition between bar and pie with the same data.
   ```js
   chart.setOption({
       dataset: [{
           dimensions: ['Income', 'Population', 'Country', 'Sex'],
           source: dataA
       }, {
           // Generate an extra dimension as id.
           transform: {
               type: 'id',
               dimensionIndex: 4,
               dimensionName: 'Id'
           }
       }]
   }, { lazyUpdate: true });
   
   chart.setOption({
       series: {
           // render bar
           type: 'custom',
           renderItem: renderBar,
           encode: { itemId: 'Id' },
           datasetIndex: 1
       }
   });
   chart.setOption({
       series: {
           // render pie
           type: 'custom',
           renderItem: renderPie,
           encode: { itemId: 'Id' },
           datasetIndex: 1
       }
   });
   ```
   
   <br>
   
   
   ## Summary
   
   At present I think `SOLUTION_B` is probably better.
   But it might offer less capability than `SOLUTION_A`. I am not sure whether there is any meaningful scenario that `SOLUTION_B` does not cover.
   
   
   What's your opinion, @pissang?
   
   
   
   
   
   


