[jira] [Updated] (DERBY-6937) Load the IMDB data set in Derby, obtain and adapt Join order Benchmark queries for use in derby

Harshvardhan Gupta (JIRA) Mon, 05 Jun 2017 10:21:05 -0700

     [ 
https://issues.apache.org/jira/browse/DERBY-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Harshvardhan Gupta updated DERBY-6937:
--------------------------------------
    Attachment: derby_script.sql
                imdb.diff

Please find the attached files. 'derby_script.sql' contains the exact script 
I used to set up the tables. The other file, 'imdb.diff', contains the changes 
to ImportReadData.java. 

There are two major changes in ImportReadData:
1) Handling NULL values, as I discussed earlier.
2) Handling escape characters.

The errors I saw related to 2) are also discussed here:
http://apache-database.10148.n7.nabble.com/Data-found-after-the-stop-delimiter-td100312.html

I handled escape characters inside Derby only. Other solutions, such as 
pre-processing the data externally, also exist, as in the case of handling 
NULL values.
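
As a rough illustration of the two kinds of handling listed above, here is a
minimal, hypothetical sketch. It is not the code in imdb.diff. It assumes
comma-separated fields, double-quote delimiters, a backslash escape character,
and that an unquoted empty field means NULL; the class and method names are
made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not taken from imdb.diff. */
final class DelimitedLineParser {
    private static final char SEPARATOR = ',';
    private static final char QUOTE = '"';
    private static final char ESCAPE = '\\'; // assumed escape character

    /**
     * Splits one record into fields. An unquoted empty field becomes null
     * (standing in for SQL NULL); a quoted empty field ("") stays "".
     */
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean wasQuoted = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == ESCAPE && i + 1 < line.length()) {
                    // Keep the escaped character literally, e.g. \" -> "
                    current.append(line.charAt(++i));
                } else if (c == QUOTE) {
                    if (i + 1 < line.length() && line.charAt(i + 1) == QUOTE) {
                        current.append(QUOTE); // doubled quote inside a field
                        i++;
                    } else {
                        inQuotes = false;      // closing delimiter
                    }
                } else {
                    current.append(c);
                }
            } else if (c == QUOTE) {
                inQuotes = true;
                wasQuoted = true;
            } else if (c == SEPARATOR) {
                fields.add(finish(current, wasQuoted));
                current.setLength(0);
                wasQuoted = false;
            } else {
                current.append(c);
            }
        }
        fields.add(finish(current, wasQuoted));
        return fields;
    }

    private static String finish(StringBuilder sb, boolean wasQuoted) {
        return (sb.length() == 0 && !wasQuoted) ? null : sb.toString();
    }
}
```

With these assumptions, the record 1,"a \"b\" c",,"" yields four fields: the
string 1, the string a "b" c, a null, and an empty string.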

'schema_derby.sql' and the 'schematext.sql' that came with the dataset are 
mostly the same, except for the fields whose data type is just 'character 
varying' with no maximum length specified. For those columns, I have used the 
'CLOB' data type in Derby, as it is semantically equivalent to 'character 
varying' of undefined length in Postgres.
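
The type mapping described above could be sketched roughly as follows. This is
a hypothetical helper for illustration only, not part of the attached files.
It assumes the Postgres type arrives as a plain string and passes every type
other than 'character varying' through unchanged (upper-cased), which is only
safe for types the two databases share.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not taken from the attachments. */
final class PostgresToDerbyTypes {
    // Matches "character varying" with an optional "(n)" length.
    private static final Pattern VARCHAR =
        Pattern.compile("character varying(?:\\s*\\((\\d+)\\))?");

    static String toDerbyType(String postgresType) {
        String t = postgresType.trim().toLowerCase(Locale.ROOT);
        Matcher m = VARCHAR.matcher(t);
        if (m.matches()) {
            // No length given: use CLOB, as described in the message.
            // Length given: keep it as a bounded VARCHAR.
            return m.group(1) == null ? "CLOB" : "VARCHAR(" + m.group(1) + ")";
        }
        // Assumption: any other type name is valid in Derby as written.
        return t.toUpperCase(Locale.ROOT);
    }
}
```

For example, toDerbyType("character varying") gives CLOB, while
toDerbyType("character varying(100)") gives VARCHAR(100).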

You should be able to apply the imdb.diff patch and import the data without any 
problems now.

Thanks,
Vardhan





> Load the IMDB data set in Derby, obtain and adapt Join order Benchmark 
> queries for use in derby 
> ------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-6937
>                 URL: https://issues.apache.org/jira/browse/DERBY-6937
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Harshvardhan Gupta
>            Assignee: Harshvardhan Gupta
>            Priority: Minor
>         Attachments: derby_script.sql, imdb.diff, schema_derby.sql
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
