Re: Suggestions for new TsFile

2020-02-11 Thread atoiLiu
Hi,

Thank you for your reply.
I am very glad that you accepted my suggestion.


Thanks

Dawei Liu


> On Feb 11, 2020, at 6:04 PM, Haonan Hou wrote:
> 
> Hi Dawei,
> 
> Thank you so much for sharing your opinion about the new TsFile!
> I am very happy to take your suggestions.
> 
> You said we can remove TsOffsetArray and directly store the offsets of 
> TimeseriesMetaData. I agree with you. It is better than my version. 
> Besides, for the optimization of TimeseriesMetaData, I would like to discuss 
> with other people to determine which way is better.
> 
> Best,
> 
> Haonan Hou
> 
> 
>> On Feb 11, 2020, at 5:35 PM, atoiLiu  wrote:
>> 
>> Hi,
>> 
>> I'm studying the new TsFile format in PR [1], and I think the design of 
>> TsFileMetaData can be improved.
>> 
>> TsFileMetaData currently holds a TsOffsetArray, which records the offset of 
>> every TimeseriesMetaData, plus a Map that records, for each device, the 
>> startIndex and endIndex of its entries in TsOffsetArray. It looks like this:
>> 
>> TsFileMetaData ---> { [0,1,2,3,4,5, …], [ {deviceId(d0), [0,2] }, 
>> {deviceId(d1), [3,5] }, … ] }
>> 
>> We can delete TsOffsetArray and store the offsets directly in the 
>> deviceIndexArray, so that TsFileMetaData has a single Map from each device 
>> to its offsets. This change saves 4 bytes per device on disk, because each 
>> device only needs to record the number of its offsets followed by the 
>> offsets themselves. It looks like this:
>> 
>> TsFileMetaData ---> [ {deviceId(d0), [0,1,2] }, {deviceId(d1), [3,4,5] }, … ]
>> 
>> 
>> In addition, TimeSeriesMetaData is stored in order on disk, and the 
>> TimeSeriesMetaData entries of each device are stored next to each other, so 
>> TsFileMetaData does not need to store every offset. There are two possible 
>> optimizations:
>> 
>> 1. Save the startTime, endTime and offset of each TimeSeriesMetaData in 
>> TsFileMetaData. The advantage is that, once TsFileMetaData has been read 
>> from disk, a time filter can be applied directly to skip every 
>> TimeSeriesMetaData that does not need to be read.
>> 
>> 
>> 2. Save only the offset of the first TimeSeriesMetaData of each device in 
>> TsFileMetaData, so that a reader seeks once and then reads the entries 
>> sequentially. It looks like this:
>> 
>> TsFileMetaData ---> [ {deviceId(d0), 0 }, {deviceId(d1), 3 }, … ]
>> 
>> 
>> 
>> [1] https://github.com/apache/incubator-iotdb/pull/736
>> 
>> Thanks
>> 
>> Dawei Liu
> 



Suggestions for new TsFile

2020-02-11 Thread atoiLiu
Hi,

I'm studying the new TsFile format in PR [1], and I think the design of 
TsFileMetaData can be improved.

TsFileMetaData currently holds a TsOffsetArray, which records the offset of 
every TimeseriesMetaData, plus a Map that records, for each device, the 
startIndex and endIndex of its entries in TsOffsetArray. It looks like this:

TsFileMetaData ---> { [0,1,2,3,4,5, …], [ {deviceId(d0), [0,2] }, {deviceId(d1), 
[3,5] }, … ] }

We can delete TsOffsetArray and store the offsets directly in the 
deviceIndexArray, so that TsFileMetaData has a single Map from each device to 
its offsets. This change saves 4 bytes per device on disk, because each device 
only needs to record the number of its offsets followed by the offsets 
themselves, instead of a startIndex and an endIndex. It looks like this:

TsFileMetaData ---> [ {deviceId(d0), [0,1,2] }, {deviceId(d1), [3,4,5] }, … ]
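
As a rough illustration of the size argument, here is a small Java sketch that compares the two layouts. It is not the TsFile code from the PR: the class and method names (MetadataLayoutSketch, rangeIndexedBytes, directBytes) are made up for this example, and it assumes 8-byte offsets and 4-byte indexes and counts. The bytes of the device id are left out because they are the same in both layouts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MetadataLayoutSketch {

  static final int INT_BYTES = 4;  // assumed size of an index or a count
  static final int LONG_BYTES = 8; // assumed size of one file offset

  // Current layout: one shared offset array plus (startIndex, endIndex) per device.
  static long rangeIndexedBytes(Map<String, long[]> offsetsByDevice) {
    long bytes = 0;
    for (long[] offsets : offsetsByDevice.values()) {
      bytes += (long) offsets.length * LONG_BYTES; // entries in the shared array
      bytes += 2L * INT_BYTES;                     // startIndex + endIndex
    }
    return bytes;
  }

  // Proposed layout: each device stores its offset count followed by its offsets.
  static long directBytes(Map<String, long[]> offsetsByDevice) {
    long bytes = 0;
    for (long[] offsets : offsetsByDevice.values()) {
      bytes += INT_BYTES;                          // number of offsets
      bytes += (long) offsets.length * LONG_BYTES; // the offsets themselves
    }
    return bytes;
  }

  public static void main(String[] args) {
    Map<String, long[]> offsetsByDevice = new LinkedHashMap<>();
    offsetsByDevice.put("d0", new long[] {0, 1, 2});
    offsetsByDevice.put("d1", new long[] {3, 4, 5});
    long saved = rangeIndexedBytes(offsetsByDevice) - directBytes(offsetsByDevice);
    System.out.println("bytes saved: " + saved); // 8, i.e. 4 bytes for each of the 2 devices
  }
}
```

Under these assumptions the saving is independent of how many offsets a device has: it is always one 4-byte int per device.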


In addition, TimeSeriesMetaData is stored in order on disk, and the 
TimeSeriesMetaData entries of each device are stored next to each other, so 
TsFileMetaData does not need to store every offset. There are two possible 
optimizations:

1. Save the startTime, endTime and offset of each TimeSeriesMetaData in 
TsFileMetaData. The advantage is that, once TsFileMetaData has been read from 
disk, a time filter can be applied directly to skip every TimeSeriesMetaData 
that does not need to be read.


2. Save only the offset of the first TimeSeriesMetaData of each device in 
TsFileMetaData, so that a reader seeks once and then reads the entries 
sequentially. It looks like this:

TsFileMetaData ---> [ {deviceId(d0), 0 }, {deviceId(d1), 3 }, … ]
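
The two directions could be sketched in Java roughly as follows. Everything here is hypothetical and only meant to make the idea concrete: the names MetadataIndexSketch, SeriesEntry, DeviceEntry, offsetsToRead and regionOf do not exist in the PR. For option 2 the sketch also assumes that the index is sorted by offset and that the reader knows where the whole TimeSeriesMetaData region ends (regionEnd), so that the last device has an end boundary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MetadataIndexSketch {

  // Option 1 entry (hypothetical): time range and offset of one TimeSeriesMetaData.
  static final class SeriesEntry {
    final long startTime;
    final long endTime;
    final long offset;

    SeriesEntry(long startTime, long endTime, long offset) {
      this.startTime = startTime;
      this.endTime = endTime;
      this.offset = offset;
    }
  }

  // Option 1: keep only the offsets whose time range overlaps [queryStart, queryEnd].
  static List<Long> offsetsToRead(List<SeriesEntry> entries, long queryStart, long queryEnd) {
    List<Long> result = new ArrayList<>();
    for (SeriesEntry e : entries) {
      if (e.endTime >= queryStart && e.startTime <= queryEnd) {
        result.add(e.offset);
      }
    }
    return result;
  }

  // Option 2 entry (hypothetical): a device and the offset of its first TimeSeriesMetaData.
  static final class DeviceEntry {
    final String deviceId;
    final long startOffset;

    DeviceEntry(String deviceId, long startOffset) {
      this.deviceId = deviceId;
      this.startOffset = startOffset;
    }
  }

  // Option 2: the region to scan for a device runs from its own start offset up to
  // the next device's start offset, or up to regionEnd for the last device.
  static long[] regionOf(List<DeviceEntry> index, String deviceId, long regionEnd) {
    for (int i = 0; i < index.size(); i++) {
      if (index.get(i).deviceId.equals(deviceId)) {
        long end = (i + 1 < index.size()) ? index.get(i + 1).startOffset : regionEnd;
        return new long[] {index.get(i).startOffset, end};
      }
    }
    throw new IllegalArgumentException("unknown device: " + deviceId);
  }

  public static void main(String[] args) {
    List<DeviceEntry> index =
        Arrays.asList(new DeviceEntry("d0", 0), new DeviceEntry("d1", 3));
    System.out.println(Arrays.toString(regionOf(index, "d0", 6))); // [0, 3]

    List<SeriesEntry> entries =
        Arrays.asList(new SeriesEntry(1, 5, 0), new SeriesEntry(100, 300, 1));
    System.out.println(offsetsToRead(entries, 50, 150)); // [1]
  }
}
```

Option 1 trades a larger TsFileMetaData for fewer reads, while option 2 keeps the index as small as possible and relies on one sequential scan per device.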



[1] https://github.com/apache/incubator-iotdb/pull/736 


Thanks

Dawei Liu

Re: About changing Github description topics

2020-02-10 Thread atoiLiu
Hi,

Infra said it was a self-serve service and provided a link to the explanatory 
document [1]. 

I have submitted PR [2] and hope you can help review it.

[1] https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories

[2] https://github.com/apache/incubator-iotdb/pull/789

Thanks

Dawei Liu

> On Feb 10, 2020, at 2:54 PM, Xiangdong Huang wrote:
> 
> Hi,
> 
> Of course, you can.
> 
> You can open a new ticket, choose the Project as INFRA, the Issue Type as
> Wish, the second Project as Incubator, and then describe your wish.
> 
> Do not forget to copy the link of this discussion into your description.
> 
> Best,
> 
> ---
> Xiangdong Huang
> School of Software, Tsinghua University
> 
> 黄向东
> 清华大学 软件学院
> 
> 
> On Sun, Feb 9, 2020 at 6:53 PM atoiLiu wrote:
> 
>> Hi,
>> Thank you very much for your replies. Here is our summary so far:
>> 
>> TimeSeries, TSDB, database, IoT, NoSQL, big-data, Java
>> 
>> Can I email Infra directly, or should I send it to someone who has
>> permission to change the topics?
>> 
>> 
>> Thanks
>> Dawei Liu
>> 
>>> On Feb 9, 2020, at 4:49 PM, Jialin Qiao wrote:
>>> 
>>> Hi,
>>> 
>>> +1 for TimeSeries, TSDB, database, IoT, NoSQL
>>> 
>>> Thanks,
>>> —
>>> Jialin Qiao
>>> School of Software, Tsinghua University
>>> 
>>> 乔嘉林
>>> 清华大学 软件学院
>>> 
>>> 
>>> On Sun, Feb 9, 2020 at 9:18 AM jincheng sun wrote:
>>> 
>>>> Thanks for bringing up this discussion!
>>>> 
>>>> +1 for adding more GitHub topics for IoTDB.
>>>> 
>>>> I think the examples that Xiangdong mentioned are pretty good. In addition,
>>>> we can also add language information, such as: Python, Java, Scala
>>>> 
>>>> Best,
>>>> Jincheng
>>>> 
>>>> 
>>>> On Sat, Feb 8, 2020 at 9:48 PM Xiangdong Huang wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> +1.  Let's discuss what topics we need to add.
>>>>> 
>>>>> E.g., TimeSeries, TSDB, TimeSeriesDatabase, database, IoT, NoSQL, etc.
>>>>> 
>>>>> Best,
>>>>> ---
>>>>> Xiangdong Huang
>>>>> School of Software, Tsinghua University
>>>>> 
>>>>> 黄向东
>>>>> 清华大学 软件学院
>>>>> 
>>>>> 
>>>>> On Sat, Feb 8, 2020 at 8:20 PM atoiLiu wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I noticed that IoTDB's GitHub topics are not set, which greatly
>>>>>> affects how easily the project can be found through search.
>>>>>> 
>>>>>> If I were someone looking for a solution to manage time series data
>>>>>> and searched for "timeseries" on GitHub, IoTDB would not be
>>>>>> recommended to me as relevant content.
>>>>>> 
>>>>>> I've looked at other Apache projects, and they all have topics set.
>>>>>> Flink, for example, sets:
>>>>>> Scala Java big-data flink
>>>>>> 
>>>>>> but we have only set IoTDB.
>>>>>> 
>>>>>> Therefore, I suggest adding the following topics:
>>>>>> 
>>>>>> database timeseries iot iov big-data Java
>>>>>> 
>>>>>> Any other suggestions?
>>>>> 
>> 
>> 



Re: About changing Github description topics

2020-02-09 Thread atoiLiu
Hi,
Thank you very much for your replies. Here is our summary so far:

TimeSeries, TSDB, database, IoT, NoSQL, big-data, Java

Can I email Infra directly, or should I send it to someone who has permission 
to change the topics?


Thanks
Dawei Liu

> On Feb 9, 2020, at 4:49 PM, Jialin Qiao wrote:
> 
> Hi,
> 
> +1 for TimeSeries, TSDB, database, IoT, NoSQL
> 
> Thanks,
> —
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院
> 
> 
> On Sun, Feb 9, 2020 at 9:18 AM jincheng sun wrote:
> 
>> Thanks for bringing up this discussion!
>> 
>> +1 for adding more GitHub topics for IoTDB.
>> 
>> I think the examples that Xiangdong mentioned are pretty good. In addition,
>> we can also add language information, such as: Python, Java, Scala
>> 
>> Best,
>> Jincheng
>> 
>> 
>> On Sat, Feb 8, 2020 at 9:48 PM Xiangdong Huang wrote:
>> 
>>> Hi,
>>> 
>>> +1.  Let's discuss what topics we need to add.
>>> 
>>> E.g., TimeSeries, TSDB, TimeSeriesDatabase, database, IoT, NoSQL, etc.
>>> 
>>> Best,
>>> ---
>>> Xiangdong Huang
>>> School of Software, Tsinghua University
>>> 
>>> 黄向东
>>> 清华大学 软件学院
>>> 
>>> 
>>> On Sat, Feb 8, 2020 at 8:20 PM atoiLiu wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I noticed that IoTDB's GitHub topics are not set, which greatly
>>>> affects how easily the project can be found through search.
>>>> 
>>>> If I were someone looking for a solution to manage time series data
>>>> and searched for "timeseries" on GitHub, IoTDB would not be
>>>> recommended to me as relevant content.
>>>> 
>>>> I've looked at other Apache projects, and they all have topics set.
>>>> Flink, for example, sets:
>>>> Scala Java big-data flink
>>>> 
>>>> but we have only set IoTDB.
>>>> 
>>>> Therefore, I suggest adding the following topics:
>>>> 
>>>> database timeseries iot iov big-data Java
>>>> 
>>>> Any other suggestions?
>>> 



About changing Github description topics

2020-02-08 Thread atoiLiu
Hi,

I noticed that IoTDB's GitHub topics are not set, which greatly affects how 
easily the project can be found through search.

If I were someone looking for a solution to manage time series data and 
searched for "timeseries" on GitHub, IoTDB would not be recommended to me as 
relevant content.

I've looked at other Apache projects, and they all have topics set. Flink, for 
example, sets:
Scala Java big-data flink

but we have only set IoTDB.

Therefore, I suggest adding the following topics:

database timeseries iot iov big-data Java

Any other suggestions?

Re: [VOTE] Enable github issue

2020-01-03 Thread atoiLiu
Hi,
+1 
Regards,

> On Jan 3, 2020, at 5:33 PM, Jialin Qiao wrote:
> 
> Hi,
> 
> I'd like to call a vote on enabling GitHub issues.
> 
> GitHub issues could serve as a user mailing list, because Chinese users
> prefer GitHub issues to Jira. However, for more convenient project
> management, we will also use Jira.
> 
> We could manage GitHub and Jira as follows:
> 
> 【Create GitHub issue】 Users create GitHub issues when they want to report a
> bug or request a new feature. Discussion takes place under the GitHub issue.
> 
> 【Create Jira issue】 If a GitHub issue is valuable, we create a Jira issue
> for it and link the GitHub issue URL. After that, we add an [IOTDB-xxx]
> prefix to the title of the GitHub issue to mark that it is already linked to
> a Jira issue.
> 
> 【Create PR】 Each functional PR or bug-fix PR should relate to a Jira issue.
> PMC members or committers can help link PRs to Jira issues. Some minor PRs,
> such as typo fixes, can be merged without a link to a Jira issue.
> 
> 【Close Jira issue, close GitHub issue, merge PR】 Before merging a PR, the
> contributor or whoever merges the PR should set the fix version in the Jira
> issue and close the related GitHub issue, if there is one.
> 
> The vote is open for the next 72 hours and passes if there are at least
> three +1 votes and more +1 votes than -1 votes.
> 
> Please vote accordingly:
> 
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove with the reason
> 
> Best,
> --
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院



Re: [DISCUSS] Enable github issue?

2020-01-03 Thread atoiLiu
Hi,
I agree with Lei Rui's opinion. The vote should also clearly describe how we 
will work once GitHub issues are enabled.

Regards,


> On Jan 3, 2020, at 4:49 PM, Lei Rui wrote:
> 
> I would suggest a formal vote.
> 
> 
> Regards,
> Lei Rui
> 
> 
> On 1/3/2020 16:41,Jialin Qiao wrote:
> Hi,
> 
> In the discussion about enabling GitHub issues, 5 people are in favor and
> nobody has objected.
> 
> Jialin Qiao
> Minhao Gouwang
> Xiangdong Huang
> Tianci Zhu
> Dawei Liu
> 
> I think there is no need to call for a vote. Meanwhile, Jira is also used
> and we could carry the issues to Jira through some tools.
> 
> Could someone help enable the github issues?
> 
> Thanks,
> 
> On Thu, Jan 2, 2020 at 4:29 PM Jialin Qiao wrote:
> 
> Hi,
> 
> Let's continue this discussion or make a decision?
> 
> On Tue, Dec 31, 2019 at 7:28 PM Jialin Qiao wrote:
> 
> Hi,
> 
> +1 for managing issues in github.
> +0 for using Jira.
> 
> Thanks
> Jialin Qiao
> 
> 
> 
> --
> —
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院
> 
> 
> 
> --
> —
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院




Re: Feedback on some IoTDB issues

2019-12-24 Thread atoiLiu
Hi,

Can you tell us which version you are currently using? The version is printed 
when client-shell starts, or you can run the SQL statement `show version` to 
view it.

Best

> On Dec 24, 2019, at 4:52 PM, Robin wrote:
> 
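> Hello, IoTDB development team,
> My company is currently building applications on your product, and our data
> volume is now roughly 20 million.
> While using it we have run into a few small problems and hope you can help
> clear them up.
> For example, the following SQL query executes normally:
> select count(preMeter) from 
> root.alap.a510114.s5101141023.p5101141023100310.g2 GROUP BY(1d, 
> [157475377, 157492657]);
> However, after I mistyped the count() function as count1(), the correct SQL
> above no longer returned data. The server had to be restarted before it
> worked again.
> In addition, regarding the execution of some statements given in your
> official documentation:
> 
>   COUNT TIMESERIES root                    count the number of time series
>   SHOW DEVICE                              show devices
>   Fill                                     all fill functions
>   DELETE STORAGE GROUP root.ln.wf01.wt01   delete the newly created storage group
> 
> 
> Finally, I would like to be added to your instant-messaging contact channel,
> so that I can easily ask when questions come up.
> My sincere thanks.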
> 



Re: Who can review this pr?

2019-12-24 Thread atoiLiu
Hi ,
Thank you very much for your reply.
I have addressed all the suggestions you proposed, and I also made further 
changes to the document after your review.
If you have time, please check the new changes.

Best

> On Dec 24, 2019, at 5:39 PM, Jialin Qiao wrote:
> 
> Hi,
> 
> Thanks for your contribution :)  I have reviewed your PR and given some
> advice.
> 
> Best,
> Jialin Qiao
> 
> On Tue, Dec 24, 2019 at 5:35 PM atoiLiu wrote:
> 
>> Hi ,
>> The documentation has not been updated for a long time. While reading it
>> today I found some problems and made some changes:
>> 
>> 1. Add the Frequently Asked Questions CN doc
>> 2. Add the Docker Image CN doc
>> 3. Add the Programming-JDBC CN doc
>> 4. Add the TsFile API CN doc
>> 5. Some minor modifications
>> 
>> https://github.com/apache/incubator-iotdb/pull/674
>> 
>> Best
> 
> 
> 
> -- 
> —
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院




Who can review this pr?

2019-12-24 Thread atoiLiu
Hi ,
The documentation has not been updated for a long time. While reading it today 
I found some problems and made some changes:

1. Add the Frequently Asked Questions CN doc
2. Add the Docker Image CN doc
3. Add the Programming-JDBC CN doc
4. Add the TsFile API CN doc
5. Some minor modifications

https://github.com/apache/incubator-iotdb/pull/674 


Best

Re: About the iotdb website

2019-12-12 Thread atoiLiu
Hi,
> are you good at building a website, and  interested in that?
This is not an area I am good at, but I really want to do more for the 
community.

> On Dec 12, 2019, at 4:20 PM, Xiangdong Huang wrote:
> 
> Hi,
> 
> +1.
> 
> @atoiLiu,  are you good at building a website, and  interested in that?
> 
> Best,
> ---
> Xiangdong Huang
> School of Software, Tsinghua University
> 
> 黄向东
> 清华大学 软件学院
> 
> 
> On Thu, Dec 12, 2019 at 4:16 PM Jialin Qiao wrote:
> 
>> Hi,
>> 
>> +1 for updating the website.
>> 
>> Thanks,
>> Jialin Qiao
>> 
>> On Thu, Dec 12, 2019 at 3:41 PM atoiLiu wrote:
>> 
>>> Hi,
>>> I don't know whether you have asked friends to open the official IoTDB
>>> website. If so, did they encounter any problems?
>>> 1. The style of the official website is very dated and feels like a
>>> framework website built a few years ago.
>>> 2. The website opens slowly, and the carousel pictures sometimes cannot be
>>> displayed correctly.
>>> 3. There is no internationalization, which makes it relatively hard to read.
>>> 
>>> I think the site may need to be updated to attract new customers.
>>> 
>>> Here are some really cool and fast websites:
>>> [1] http://servicecomb.apache.org/cn/
>>> [2] https://pingcap.com/
>>> [3] http://dubbo.apache.org/zh-cn/
>> 
>> 
>> 
>> --
>> —
>> Jialin Qiao
>> School of Software, Tsinghua University
>> 
>> 乔嘉林
>> 清华大学 软件学院
>> 



About the iotdb website

2019-12-11 Thread atoiLiu
Hi,
I don't know whether you have asked friends to open the official IoTDB 
website. If so, did they encounter any problems?
1. The style of the official website is very dated and feels like a framework 
website built a few years ago.
2. The website opens slowly, and the carousel pictures sometimes cannot be 
displayed correctly.
3. There is no internationalization, which makes it relatively hard to read.

I think the site may need to be updated to attract new customers.

Here are some really cool and fast websites:
[1] http://servicecomb.apache.org/cn/ 
[2] https://pingcap.com/ 
[3] http://dubbo.apache.org/zh-cn/ 

Re: question about Apache Jenkins and Sonar

2019-12-11 Thread atoiLiu
Hi,
Perhaps this token is not a required parameter. Alternatively, instead of using 
a personal account, how about using an account created specifically for CI?

> On Dec 12, 2019, at 2:02 PM, Xiangdong Huang wrote:
> 
> Hi,
> 
> The analysis repo on SonarCloud has been created [1].
> 
> I read the guide [2] and the examples of the PLC4X [3] and Sling projects.
> I noticed that all of them mention a "sonar_token", e.g.,
> "withCredentials([string(credentialsId: 'chris-sonarcloud-token', variable: 'SONAR_TOKEN')])".
> 
> I have created a token called xiangdong-iotdb-sonarcloud-token, but my
> question is: do I need to put the value of the token into the
> configuration file? If I publish the token value, is that acceptable?
> (As I understand it, the token should be kept secret.)
> 
> (I am still trying to find a configuration that works. If someone can give
> me a guide, it will be very helpful :-D ).
> 
> [1] https://sonarcloud.io/dashboard?id=apache_incubator-iotdb
> [2] https://cwiki.apache.org/confluence/display/INFRA/SonarQube+Analysis
> [3] https://github.com/apache/plc4x/blob/develop/Jenkinsfile#L124
> 
> Best,
> ---
> Xiangdong Huang
> School of Software, Tsinghua University
> 
> 黄向东
> 清华大学 软件学院
> 
> 
> On Sun, Dec 1, 2019 at 1:57 PM Xiangdong Huang wrote:
> 
>> Hi,
>> 
>> thanks Chris and Willem.
>> I have created a Jira ticket to request the creation of a project on
>> sonarcloud.io [1].
>> Until the request is completed, I have temporarily disabled the Sonar
>> analysis in Jenkins.
>> 
>> [1] https://issues.apache.org/jira/browse/INFRA-19507
>> ---
>> Xiangdong Huang
>> School of Software, Tsinghua University
>> 
>> 黄向东
>> 清华大学 软件学院
>> 
>> 
>> On Sun, Dec 1, 2019 at 9:39 AM Willem Jiang wrote:
>> 
>>> You need to do some setup [1] to enable the SonarCloud service for an
>>> Apache project.
>>> 
>>> [1]https://cwiki.apache.org/confluence/display/INFRA/SonarQube+Analysis
>>> 
>>> Willem Jiang
>>> 
>>> Twitter: willemjiang
>>> Weibo: 姜宁willem
>>> 
>>> On Sat, Nov 30, 2019 at 10:31 PM Christofer Dutz wrote:
>>>> 
>>>> Hi Xiangdong,
>>>> 
>>>> The ASF SonarQube instance is no longer being run.
>>>> The build has to be changed to SonarCloud.
>>>> 
>>>> Have a look at the PLC4X build (Jenkinsfile).
>>>> We made the change there some time ago.
>>>> 
>>>> Chris
>>>> 
>>>> On 29.11.19 at 17:24, "Xiangdong Huang" wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I find that the Apache Jenkins build failed because "SonarQube
>>>>> installation defined in this job (ASF Sonar Analysis) does not match
>>>>> any configured installation. Number of installations that can be
>>>>> configured: 0."
>>>>> 
>>>>> I checked recent commits, and the most likely cause is the code
>>>>> modification that moved `vulnerability-checks` to the `apache-release`
>>>>> profile.
>>>>> 
>>>>> So, is this the task that triggers Jenkins to submit a job to SonarQube?
>>>>> 
>>>>> If so, you'd better revert your modification to the pom file,
>>>>> @jialin Qiao.
>>>>> 
>>>>> Best,
>>>>> --
>>>>> Xiangdong Huang
>>>>> School of Software, Tsinghua University
>>>>> 
>>>>> 黄向东
>>>>> 清华大学 软件学院
>>>>> 
>>> 
>> 



Re: [Discuss] about collecting info to know who are using IoTDB

2019-12-11 Thread atoiLiu
Add a guestbook at http://iotdb.incubator.apache.org/#/
Open an issue on GitHub

> On Dec 11, 2019, at 2:24 PM, Xiangdong Huang wrote:
> 
> Hi,
> 
> I notice that the ShardingSphere incubating project has a page for collecting
> who is using it [1]. They use a GitHub issue to collect the data [2].
> 
> I think it is helpful for the project graduation.
> 
> So, if IoTDB also collects this info, how should we do that? Any ideas?
> 
> 1. We have not enabled the issue module on GitHub.
> 2. Many Chinese users do not have an Apache JIRA account, and
> accessing JIRA from China may require some skill because of network
> problems there.
> 
> So, how about opening a ticket on JIRA to collect users outside China,
> and opening an online document, e.g., a QQ document, to collect users in China?
> (But it is hard to record who maintains the document: everyone can
> modify it, while on JIRA and GitHub the content is bound to accounts.)
> 
> [1] https://shardingsphere.apache.org/community/en/poweredby/ 
> 
> [2] https://github.com/sharding-sphere/sharding-sphere/issues/234 
> 
> 
> Best,
> ---
> Xiangdong Huang
> School of Software, Tsinghua University
> 
> 黄向东
> 清华大学 软件学院



Re: StorageGroupProcessor.sequenceFileList is ordered by fileName rather than dataTime

2019-12-10 Thread atoiLiu
Hi,

I think the semantics of load are the same as those of insert, except that what 
is inserted is a sealed file. So I think the loaded file should be placed into 
IoTDB as an unsequence file and sorted in memory together with the existing 
files.

This may make queries very slow, so perhaps we should prompt the user to run a 
merge command?

> On Dec 10, 2019, at 9:04 PM, Xiangdong Huang wrote:
> 
> Hi,
> 
> I think it is a bug in the `load` function now, and it needs to be fixed
> quickly.
> 
> First, let's consider the case where there is no `load` function.
> In this case, the files have the same order no matter which device's
> timeline you use as the ordering dimension.
> 
> (Second, in your case, can we put TsFile 105 into the sequence files?
> Condition: all devices in a flushing memtable fit in a time hole of
> the sequence files.)
> 
> Third, let's consider the case where the `load` function is enabled.
> 
> The worst case is that you add a file that has two devices (device1 and
> device2): if you use device1's timeline to order the files, it falls between
> F2 and F3, while it falls between F1 and F2 if you use device2's timeline.
> 
> device1: F1   F2   _HOLE__ F3
> device2: F1  __HOLE__ F2  F3
> 
> Then, why not split the file into two files?
> 
> Best,
> ---
> Xiangdong Huang
> School of Software, Tsinghua University
> 
> 黄向东
> 清华大学 软件学院
> 
> 
> On Tue, Dec 10, 2019 at 7:05 PM Jialin Qiao wrote:
> 
>> Hi,
>> 
>> Things become complicated when the load file feature is introduced in
>> IoTDB. The newly added data file may contain many devices with different
>> time intervals. Therefore, one order of TsFileResources is insufficient.
>> A possible solution is to sort the TsFileResources temporarily when
>> querying.
>> 
>> Thanks,
>> Jialin Qiao
>> 
>> On Mon, Dec 9, 2019 at 12:14 AM Lei Rui (Jira) wrote:
>> 
>>> Lei Rui created IOTDB-346:
>>> -
>>> 
>>> Summary: StorageGroupProcessor.sequenceFileList is ordered by
>>> fileName rather than dataTime
>>> Key: IOTDB-346
>>> URL: https://issues.apache.org/jira/browse/IOTDB-346
>>> Project: Apache IoTDB
>>>  Issue Type: Bug
>>>Reporter: Lei Rui
>>> 
>>> 
>>> `StorageGroupProcessor.sequenceFileList` is ordered by fileName rather
>>> than by time of data, as reflected in the
>>> `StorageGroupProcessor.getAllFiles` method code:
>>> {code:java}
>>> tsFiles.sort(this::compareFileName);
>>> {code}
>>> 
>>> I use the following examples to expose the bug when the order of fileName
>>> is inconsistent with that of dataTime.
>>> 
>>> First, for preparation, I created three tsfiles using the following sql:
>>> {code:java}
>>> SET STORAGE GROUP TO root.ln.wf01.wt01
>>> CREATE TIMESERIES root.ln.wf01.wt01.status WITH DATATYPE=BOOLEAN,
>>> ENCODING=PLAIN
>>> CREATE TIMESERIES root.ln.wf01.wt01.temperature WITH DATATYPE=DOUBLE,
>>> ENCODING=PLAIN
>>> CREATE TIMESERIES root.ln.wf01.wt01.hardware WITH DATATYPE=INT32,
>>> ENCODING=PLAIN
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(1, 1.1, false, 11)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(2, 2.2, true, 22)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(3, 3.3, false, 33)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(4, 4.4, false, 44)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(5, 5.5, false, 55)
>>> flush
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(100, 100.1, false, 110)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(150, 200.2, true, 220)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(200, 300.3, false, 330)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(250, 400.4, false, 440)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(300, 500.5, false, 550)
>>> flush
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(10, 10.1, false, 110)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(20, 20.2, true, 220)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(30, 30.3, false, 330)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(40, 40.4, false, 440)
>>> INSERT INTO root.ln.wf01.wt01(timestamp,temperature,status, hardware)
>>> values(50, 50.5, false, 550)
>>> flush
>>> {code}
>>> The tsfiles created are organized in the following directory structure:
>>> {code:java}
>>> |data
>>> |--sequence
>>> |root.ln.wf01.wt01
>>> |--1575813520203-101-0.tsfile
>>> |--1575813520203-101-0.tsfile.resource
>>> |--1575813520669-103-0.tsfile
>>> |--1575813520669-103-0.tsfile.resource
>>> |--unsequence
>>> |root.ln.wf01.wt01
>>> 

Re: Code refactoring of Query

2019-12-10 Thread atoiLiu
Hi,
The OFFSET clause is sometimes used in SQL, so hopefully the refactored query 
engine will also support an offset query that skips rows.


> On Dec 10, 2019, at 7:16 PM, Jialin Qiao wrote:
> 
> Hi,
> 
> Code refactoring is inevitable when building a large system. The read/write
> of TsFile, storage engine of the server have been refactored. Now, it's
> time to refactor the query engine in the server.
> 
> Currently, the query is in a tuple-at-a-time manner. In the meantime, the
> interface is in chaos.
> To improve the query speed, a batch-at-a-time(vectorization) iteration is
> needed.
> 
> I have opened a branch with Lei Rui: f_batch_reader. The existing
> interfaces are simplified and IBatchReader is added. Welcome to work on
> this branch for query optimization.
> 
> Thanks,
> —
> Jialin Qiao
> School of Software, Tsinghua University
> 
> 乔嘉林
> 清华大学 软件学院



Add data to TSExecuteStatementResp

2019-12-09 Thread atoiLiu
Currently, executing an SQL statement through JDBC takes two steps:

1. The statement is executed and the server returns the query metadata, 
setting hasResultSet = true.

2. When the client sees hasResultSet = true, it issues a second request for 
the data by calling the server's fetchResult method.

I think this can be optimized to save one request to the server, especially 
when the result is empty or the total amount of data is smaller than 
fetchSize.

So I added a TSQueryDataSet to TSExecuteStatementResp so that the client can 
traverse the data directly.
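
To show the effect on the number of requests, here is a minimal, self-contained Java sketch. It does not use the real Thrift classes and is not the code in the PR: ExecuteResponse, FakeServer and readAll are hypothetical stand-ins, rows are plain strings, and the rule "fetch again only if the previous batch was full" is my assumption about how a client could decide whether more data is left.

```java
import java.util.ArrayList;
import java.util.List;

class InlineFirstBatchSketch {

  // Hypothetical stand-in for TSExecuteStatementResp carrying the first batch of rows.
  static final class ExecuteResponse {
    final boolean hasResultSet;
    final List<String> firstBatch;

    ExecuteResponse(boolean hasResultSet, List<String> firstBatch) {
      this.hasResultSet = hasResultSet;
      this.firstBatch = firstBatch;
    }
  }

  // Hypothetical in-memory server that counts how many requests it receives.
  static final class FakeServer {
    private final List<String> rows;
    private int cursor = 0;
    int requests = 0;

    FakeServer(List<String> rows) {
      this.rows = rows;
    }

    // Executes the statement and returns the first batch inside the response.
    ExecuteResponse execute(int fetchSize) {
      requests++;
      return new ExecuteResponse(true, nextBatch(fetchSize));
    }

    // Returns the next batch of at most fetchSize rows.
    List<String> fetch(int fetchSize) {
      requests++;
      return nextBatch(fetchSize);
    }

    private List<String> nextBatch(int fetchSize) {
      int end = Math.min(cursor + fetchSize, rows.size());
      List<String> batch = new ArrayList<>(rows.subList(cursor, end));
      cursor = end;
      return batch;
    }
  }

  // Client side: consume the inlined batch first and fetch again only while the
  // previous batch was full. Assumes fetchSize > 0.
  static List<String> readAll(FakeServer server, int fetchSize) {
    List<String> all = new ArrayList<>();
    ExecuteResponse resp = server.execute(fetchSize);
    if (!resp.hasResultSet) {
      return all;
    }
    List<String> batch = resp.firstBatch;
    all.addAll(batch);
    while (batch.size() == fetchSize) {
      batch = server.fetch(fetchSize);
      all.addAll(batch);
    }
    return all;
  }
}
```

In this sketch a result that is empty or smaller than fetchSize costs exactly one request, and larger results need one request fewer than they would if the first batch always required a separate fetch.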



I hope my idea can contribute to the community. Can anyone review it for me?

Pr:
https://github.com/apache/incubator-iotdb/pull/631 



In addition, I found a new problem. 
When I enter a random, invalid SQL statement in the client, the server throws 
an ANTLR error that is not caught as the expected SQLParserException, so no 
friendly message is shown to the user. 
I think adding a try...catch to parseSQLToPhysicalPlan should solve this 
problem, but I am not sure whether I am right.
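
The kind of wrapping I have in mind could look like the sketch below. It only illustrates the pattern and does not touch the real parser: rawParse is a fake stand-in for the ANTLR-generated parser, SqlParseException stands in for the exception type the client already reports, and parseToPlan plays the role of parseSQLToPhysicalPlan. All three names are hypothetical.

```java
class ParseErrorWrappingSketch {

  // Stand-in for the checked exception that the client already knows how to report.
  static final class SqlParseException extends Exception {
    SqlParseException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Fake parser: accepts only statements starting with "select" and throws an
  // unchecked exception otherwise, the way a generated parser might.
  static String rawParse(String sql) {
    String trimmed = sql.trim();
    if (!trimmed.regionMatches(true, 0, "select", 0, 6)) {
      throw new IllegalStateException("unexpected input: " + trimmed);
    }
    return "QueryPlan(" + trimmed + ")";
  }

  // Wrapper: converts any unchecked parser failure into the expected exception type,
  // keeping the original error as the cause.
  static String parseToPlan(String sql) throws SqlParseException {
    try {
      return rawParse(sql);
    } catch (RuntimeException e) {
      throw new SqlParseException("Statement could not be parsed: " + e.getMessage(), e);
    }
  }
}
```

Catching RuntimeException here is deliberately broad. In the real code it may be better to catch only the specific exception types the parser is known to throw.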