[ 
https://issues.apache.org/jira/browse/DERBY-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16061206#comment-16061206
 ] 

Harshvardhan Gupta commented on DERBY-6940:
-------------------------------------------

Hi Bryan,

I thought of a workaround and it was successful. Specifically, I compare maxVal 
and minVal; if they are equal, I first write an indicator boolean and then 
write only one DataValueDescriptor object. In all other cases, I write the 
indicator followed by maxVal and then minVal. This way the problematic object 
is always written last, and only once.

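        public void writeExternal(ObjectOutput out)
                throws IOException
        {
                FormatableHashtable fh = new FormatableHashtable();
                fh.putLong("numRows", numRows);
                fh.putLong("numUnique", numUnique);
                fh.putLong("nullCount", nullCount);
                out.writeObject(fh);

                // Decide the layout before writing anything, so that exactly
                // one of the two layouts ends up on the stream. (A finally
                // block would also run after the early return and write the
                // two-value layout a second time.)
                boolean sameValue;
                try {
                        sameValue = maxVal.equals(maxVal, minVal).getBoolean();
                }
                catch (StandardException e) {
                        // Comparison failed: fall back to writing both values.
                        sameValue = false;
                }

                out.writeBoolean(sameValue);
                out.writeObject(maxVal);
                if (!sameValue) {
                        out.writeObject(minVal);
                }
        }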



        public void readExternal(ObjectInput in)
                throws IOException, ClassNotFoundException
        {
                FormatableHashtable fh = (FormatableHashtable)in.readObject();
                numRows = fh.getLong("numRows");
                numUnique = fh.getLong("numUnique");
                nullCount = fh.getLong("nullCount");
                if(in.readBoolean()){
                        maxVal = (DataValueDescriptor)in.readObject();
                        minVal = maxVal.cloneValue(true);
                }
                else{
                        maxVal = (DataValueDescriptor) in.readObject();
                        minVal = (DataValueDescriptor) in.readObject();
                }
        }
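
For illustration, the same flag-then-value layout can be exercised outside 
Derby with plain java.io streams. This is only a sketch under assumptions: 
MinMaxRoundTrip is a hypothetical class, and Integer stands in for 
DataValueDescriptor, so the single value is reused on read rather than cloned 
with cloneValue(true) as in readExternal above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical stand-in for the statistics object; not a Derby class.
public class MinMaxRoundTrip {

    /** Writes a flag, then maxVal, and minVal only when it differs from maxVal. */
    static byte[] write(Integer maxVal, Integer minVal) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            boolean sameValue = maxVal.equals(minVal);
            out.writeBoolean(sameValue);
            out.writeObject(maxVal);
            if (!sameValue) {
                out.writeObject(minVal);
            }
        }
        return bytes.toByteArray();
    }

    /** Reads the flag and returns the pair as {maxVal, minVal}. */
    static Integer[] read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(data))) {
            boolean sameValue = in.readBoolean();
            Integer maxVal = (Integer) in.readObject();
            // Integer is immutable, so sharing the reference is safe here.
            Integer minVal = sameValue ? maxVal : (Integer) in.readObject();
            return new Integer[] { maxVal, minVal };
        }
    }

    public static void main(String[] args) throws Exception {
        Integer[] equal = read(write(7, 7));
        Integer[] distinct = read(write(9, 3));
        System.out.println(equal[0] + " " + equal[1]);
        System.out.println(distinct[0] + " " + distinct[1]);
    }
}
```

The point of computing the flag before any write is that the reader can 
always tell from the first boolean how many objects follow.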

> Enhance derby statistics for more accurate selectivity estimates.
> -----------------------------------------------------------------
>
>                 Key: DERBY-6940
>                 URL: https://issues.apache.org/jira/browse/DERBY-6940
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Harshvardhan Gupta
>            Assignee: Harshvardhan Gupta
>            Priority: Minor
>         Attachments: DERBY-6940_2.diff, DERBY-6940_3.diff, derby-6940.diff, 
> EOFException_derby.log, EOFException.txt
>
>
> Derby should collect extra statistics at index build time and at statistics 
> refresh time, which will help the optimizer make more precise selectivity 
> estimates and choose better execution paths.
> We eventually want to utilize the new statistics to make better selectivity 
> estimates / cost estimates that will help find the best query plan. Currently 
> Derby keeps two types of stats - the total row count and the number of unique 
> values.
> We are initially extending the stats to include null count, the minimum value 
> and maximum value associated with each of the columns of an index. This would 
> be useful in selectivity estimates for operators such as [ IS NULL, <, <=, >, 
> >= ] , all of which currently rely on hardwired selectivity estimates.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
