[jira] [Commented] (DERBY-4708) In the Administration Guide, clarify that you need to adjust file permissions in your security policy in order to prevent import/export from accessing sensitive files o

2017-06-14 Thread Rick Hillegas (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049899#comment-16049899
 ] 

Rick Hillegas commented on DERBY-4708:
--

This issue was tracked by CVE-2010-2232. See 
https://issues.apache.org/jira/browse/DERBY-2925?focusedCommentId=16049897=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16049897

> In the Administration Guide, clarify that you need to adjust file permissions 
> in your security policy in order to prevent import/export from accessing 
> sensitive files outside your Derby subsystem
> ---
>
> Key: DERBY-4708
> URL: https://issues.apache.org/jira/browse/DERBY-4708
> Project: Derby
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 10.6.1.0
> Reporter: Rick Hillegas
> Assignee: Rick Hillegas
> Fix For: 10.6.2.1, 10.7.1.1
>
> Attachments: derby-4708-01-aa-clarification.diff, 
> derby-4708-01-aa-clarification.tar, derby-4708-01-ab-clarification.diff
>
>
> Right now the Derby Administration Guide advises users to adjust permissions 
> in their security policy file in order to prevent backup/restore from 
> clobbering and inspecting sensitive files outside the Derby subsystem. This 
> advice can be found in the section titled "Basic Network Server security 
> policy". This section should be clarified to note that you can suffer similar 
> exposure from the export/import procedures and that you need to adjust your 
> security policy for them as well.
> Note that this section does link to another, detailed section, which 
> describes the security policy implications for both backup/restore and 
> export/import: "Customizing the Network Server's security policy".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DERBY-2925) Prevent export from overwriting existing files

2017-06-14 Thread Rick Hillegas (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049897#comment-16049897
 ] 

Rick Hillegas commented on DERBY-2925:
--

This issue was tracked by CVE-2010-2232 along with the documentation 
improvement at https://issues.apache.org/jira/browse/DERBY-4708. The fixes 
appeared in Derby version 10.6.2.1 (see 
http://db.apache.org/derby/releases/release-10.6.2.1.html), which was released 
on 2010-10-05.

> Prevent export from overwriting existing files
> --
>
> Key: DERBY-2925
> URL: https://issues.apache.org/jira/browse/DERBY-2925
> Project: Derby
> Issue Type: Sub-task
> Components: Tools
> Affects Versions: 10.1.2.1, 10.2.2.0, 10.3.1.4, 10.4.1.3
> Reporter: Kathey Marsden
> Assignee: Ramin Moazeni
> Fix For: 10.3.1.4, 10.4.1.3, 10.6.2.1, 10.7.1.1
>
> Attachments: derby-2925-07-aa-fileUrl.diff, DERBY-2925v0.diff, 
> DERBY-2925v0.stat, DERBY-2925v1.diff, DERBY-2925v1.stat, DERBY-2925v2.diff, 
> DERBY-2925v2.stat, DERBY-2925v3.diff, DERBY-2925v3.stat, DERBY-2925v4.diff, 
> DERBY-2925v4.stat, DERBY-2925v5.diff, DERBY-2925v5.stat, DERBY-2925v6.diff, 
> DERBY-2925v6.stat, releaseNote.html, releaseNotev0.html
>
>
> Export should not overwrite existing files, but rather insist that the user 
> remove them before writing to the file.  This will help prevent accidental or 
> intentional corruption of the database with export.  This may introduce a 
> compatibility issue with export but because export is usually an attended 
> utility and not typically invoked as part of an application, I think the risk 
> is worth the additional security this will provide.





[jira] [Commented] (DERBY-6938) Obtain cardinality estimates and true estimates for base tables as well as for intermediate results for queries involving multiple joins.

2017-06-14 Thread Harshvardhan Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048798#comment-16048798
 ] 

Harshvardhan Gupta commented on DERBY-6938:
---

The specific approach I have in mind is to keep the minimum and maximum value 
of each column, and the number of NULL values, in the statistics. These could 
be used to estimate selectivity for operators such as <, >, <=, >=, IS NULL 
and IS NOT NULL.

For example, let's say we have an int column whose minimum and maximum values 
are 20 and 100 respectively. A query predicate on that column with the 
condition >= 80 should then ideally return 25% of all rows, since 
(100 - 80) / (100 - 20) = 0.25. This approach obviously assumes a uniform 
distribution but should be good enough to get started with. We should be able 
to refine it later on by taking the actual distribution into account.
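
A minimal sketch of the estimate described above, written in Python for 
brevity rather than in Derby's Java. The function name and its parameters are 
hypothetical and not part of Derby. It treats > and >= alike, which is only a 
rough approximation for integer columns, and it assumes that values are spread 
uniformly between the stored minimum and maximum.

```python
def range_selectivity(op, value, col_min, col_max, null_count=0, row_count=1):
    """Estimate the fraction of rows matching `column <op> value`.

    Hypothetical helper: assumes a uniform distribution between
    col_min and col_max, and that NULLs never satisfy a comparison.
    """
    if row_count <= 0:
        return 0.0
    non_null_fraction = (row_count - null_count) / row_count

    if op == "IS NULL":
        return 1.0 - non_null_fraction
    if op == "IS NOT NULL":
        return non_null_fraction
    if op not in ("<", "<=", ">", ">="):
        raise ValueError("unsupported operator: " + op)

    if col_max == col_min:
        # Only one distinct value: either every non-NULL row matches or none.
        v = col_min
        matches = {"<": v < value, "<=": v <= value,
                   ">": v > value, ">=": v >= value}[op]
        return non_null_fraction if matches else 0.0

    # Clamp the constant into the known range, then take the matching
    # share of the [col_min, col_max] interval.
    clamped = min(max(value, col_min), col_max)
    if op in (">", ">="):
        fraction = (col_max - clamped) / (col_max - col_min)
    else:
        fraction = (clamped - col_min) / (col_max - col_min)
    return fraction * non_null_fraction


# The example from the comment: min 20, max 100, predicate >= 80.
print(range_selectivity(">=", 80, 20, 100, null_count=0, row_count=1000))
```

Multiplying by the non-NULL fraction is a design choice of this sketch: it 
keeps the NULL count useful for range predicates as well as for IS NULL and 
IS NOT NULL.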

>  Obtain cardinality estimates and true estimates for base tables as well as 
> for intermediate results for queries involving multiple joins. 
> ---
>
> Key: DERBY-6938
> URL: https://issues.apache.org/jira/browse/DERBY-6938
> Project: Derby
> Issue Type: Sub-task
> Components: SQL
> Reporter: Harshvardhan Gupta
> Assignee: Harshvardhan Gupta
> Attachments: explain.txt
>
>






[jira] [Commented] (DERBY-6938) Obtain cardinality estimates and true estimates for base tables as well as for intermediate results for queries involving multiple joins.

2017-06-14 Thread Harshvardhan Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048793#comment-16048793
 ] 

Harshvardhan Gupta commented on DERBY-6938:
---

Bryan,
Regarding my earlier question, one thing that was particularly useful for 
digging into the optimizer was enabling optimizer tracing.
https://wiki.apache.org/db-derby/OptimizerTracing

The trace output is quite verbose and helps to understand the various choices 
the optimizer is making.

A few observations and possible improvements that I would like to point out:

1) Derby falls back to nested loops more often than we would like, 
particularly for large tables. Currently the hash table resides entirely 
in memory, and Derby rules out the HASHJOIN approach if it suspects that the 
hash table is going to be too large (the default limit is 1048576).
Nested loops do not seem to be a good option, especially when joining 
relatively large tables (similar to the IMDB dataset we are using) across more 
than 4 joins.

It is also documented in the optimizer paper that creating hash tables that 
spill to disk is a potential improvement, and my experiments confirm that.

2) Another potential improvement concerns cardinality estimates. Derby 
currently uses hard-wired numbers for selectivity for every operator other 
than the equality operator.
https://db.apache.org/derby/docs/10.0/manuals/tuning/perf56.html

In the case of an equality operator with a value known at compile time, it 
uses statistics and makes selectivity assumptions based on the number of 
unique values. I think we can enhance the statistics to be able to make better 
cardinality estimates.
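
To make the two observations concrete, here is a simplified model in Python. 
It is not Derby's actual code: the function names are hypothetical, the unit 
of the 1048576 limit is assumed to be bytes, and the real optimizer also 
weighs costs that are ignored here. The equality estimate uses the usual 
one-over-distinct-values rule, which matches the description of using the 
number of unique values.

```python
# Default limit quoted in the comment; the unit (bytes) is an assumption.
ASSUMED_MEMORY_LIMIT = 1048576


def hash_table_fits(estimated_rows, estimated_row_bytes,
                    limit=ASSUMED_MEMORY_LIMIT):
    """Return True if the estimated in-memory hash table stays under the limit.

    Simplified: size is modelled as rows * bytes per row, with no overhead.
    """
    return estimated_rows * estimated_row_bytes <= limit


def equality_selectivity(distinct_values):
    """Selectivity of `column = constant`, assuming each distinct value
    is equally frequent."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return 1.0 / distinct_values


def estimated_row_count(table_rows, selectivity):
    """Expected number of rows that survive a predicate."""
    return table_rows * selectivity


# A table of 10,000 rows at 100 bytes each fits; 20,000 rows do not, so in
# this model the optimizer would fall back to a nested loop join.
print(hash_table_fits(10_000, 100), hash_table_fits(20_000, 100))

# 1,000,000 rows with 50 distinct values in the filtered column.
print(estimated_row_count(1_000_000, equality_selectivity(50)))
```

In this model a hash table that could spill to disk would simply remove the 
hard cut-off in hash_table_fits, which is the improvement suggested in 
observation 1.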








[jira] [Commented] (DERBY-6938) Obtain cardinality estimates and true estimates for base tables as well as for intermediate results for queries involving multiple joins.

2017-06-14 Thread Harshvardhan Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/DERBY-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048780#comment-16048780
 ] 

Harshvardhan Gupta commented on DERBY-6938:
---

To view and compare the estimated row counts and true row counts for base 
tables and for intermediate results, the following queries can be run against 
the xplain tables. The value of OP_IDENTIFIER can be changed to get data for 
the nodes of a particular operation such as HASHJOIN, NLJOIN, etc.

select SEEN_ROWS, SEEN_ROWS_RIGHT, RETURNED_ROWS, EST_ROW_COUNT
from SYSXPLAIN_RESULTSETS, SYSXPLAIN_STATEMENTS
where OP_IDENTIFIER = 'HASHJOIN'
  and SYSXPLAIN_STATEMENTS.STMT_ID = SYSXPLAIN_RESULTSETS.STMT_ID;

To combine this with the scan information for nodes involved in scans:

select SEEN_ROWS, RETURNED_ROWS, EST_ROW_COUNT, OP_IDENTIFIER
from SYSXPLAIN_STATEMENTS, SYSXPLAIN_RESULTSETS, SYSXPLAIN_SCAN_PROPS
where SYSXPLAIN_STATEMENTS.STMT_ID = SYSXPLAIN_RESULTSETS.STMT_ID
  and SYSXPLAIN_RESULTSETS.SCAN_RS_ID = SYSXPLAIN_SCAN_PROPS.SCAN_RS_ID
  and OP_IDENTIFIER like '%SCAN';
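
Once rows have been fetched with queries like the ones above, one possible way 
to rank how far off the estimates are is the ratio between estimated and true 
row counts (often called the q-error). This is a suggestion, not something 
Derby computes. The Python sketch below uses hypothetical function names and 
made-up sample rows of the form (OP_IDENTIFIER, EST_ROW_COUNT, RETURNED_ROWS).

```python
def q_error(estimated_rows, actual_rows):
    """Ratio between estimate and truth, always >= 1.0.

    Both counts are floored at 1 so that empty results do not divide by zero.
    """
    est = max(float(estimated_rows), 1.0)
    act = max(float(actual_rows), 1.0)
    return max(est / act, act / est)


def worst_estimates(rows, top=3):
    """Return the `top` nodes with the largest q-error, worst first.

    rows: iterable of (op_identifier, est_row_count, returned_rows).
    """
    scored = [(q_error(est, actual), op, est, actual)
              for op, est, actual in rows]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:top]


# Made-up sample rows, for illustration only.
sample = [
    ("HASHJOIN", 10, 1000),
    ("TABLESCAN", 500, 450),
    ("NLJOIN", 2000, 40),
]
for error, op, est, actual in worst_estimates(sample):
    print(op, est, actual, round(error, 2))
```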









