[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-03-06 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18051:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

pushed to master. Thank you [~abstractdog] for taking care of this and Peter 
for reviewing the changes!

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch, HIVE-18051.08.patch, 
> HIVE-18051.09.patch, HIVE-18051.10.patch, HIVE-18051.11.patch, 
> HIVE-18051.12.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-03-04 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.12.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch, HIVE-18051.08.patch, 
> HIVE-18051.09.patch, HIVE-18051.10.patch, HIVE-18051.11.patch, 
> HIVE-18051.12.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-03-02 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.11.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch, HIVE-18051.08.patch, 
> HIVE-18051.09.patch, HIVE-18051.10.patch, HIVE-18051.11.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-02-17 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.10.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch, HIVE-18051.08.patch, 
> HIVE-18051.09.patch, HIVE-18051.10.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-02-16 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.09.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch, HIVE-18051.08.patch, 
> HIVE-18051.09.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-02-02 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.08.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch, HIVE-18051.08.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-27 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.07.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-27 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: (was: HIVE-18051.07.patch)

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-27 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.07.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-27 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: (was: HIVE-18051.07.patch)

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-27 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.07.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch, HIVE-18051.07.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-19 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.06.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch, 
> HIVE-18051.06.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-12 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.05.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch, HIVE-18051.05.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-11 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.04.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch, HIVE-18051.04.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-09 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.03.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-09 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: (was: HIVE-18051.03.patch)

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-09 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.03.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch, 
> HIVE-18051.03.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-09 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.02.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
> Attachments: HIVE-18051.01.patch, HIVE-18051.02.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-08 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Status: Patch Available  (was: Open)

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
> Attachments: HIVE-18051.01.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18051) qfiles: dataset support

2018-01-08 Thread Laszlo Bodor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-18051:

Attachment: HIVE-18051.01.patch

> qfiles: dataset support
> ---
>
> Key: HIVE-18051
> URL: https://issues.apache.org/jira/browse/HIVE-18051
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
> Attachments: HIVE-18051.01.patch
>
>
> it would be great to have some kind of test dataset support; currently there 
> is the {{q_test_init.sql}} which is quite large; and I'm often override it 
> with an invalid string; because I write independent qtests most of the time - 
> and the load of {{src}} and other tables are just a waste of time for me ; 
> not to mention that the loading of those tables may also trigger breakpoints 
> - which is a bit annoying.
> Most of the tests are "only" using the {{src}} table and possibly 2 others; 
> however the main init script contains a bunch of tables - meanwhile there are 
> quite few other tests which could possibly also benefit from a more general 
> feature; for example the creation of {{bucket_small}} is present in 20 q 
> files.
> the proposal would be to enable the qfiles to be annotated with metadata like 
> datasets:
> {code}
> --! qt:dataset:src,bucket_small
> {code}
> proposal for storing a dataset:
> * the loader script would be at: {{data/datasets/__NAME__/load.hive.sql}}
> * the table data could be stored under that location
> a draft about this; and other qfiles related ideas:
> https://docs.google.com/document/d/1KtcIx8ggL9LxDintFuJo8NQuvNWkmtvv_ekbWrTLNGc/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)