[ https://issues.apache.org/jira/browse/DERBY-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16061206#comment-16061206 ]
Harshvardhan Gupta commented on DERBY-6940:
-------------------------------------------

Hi Bryan,

I thought of a workaround, and it was successful. I compare maxVal and minVal. If they are equal, I first write an indicator boolean and then write only one DataValueDescriptor object. In all other cases, I write the indicator as false, then maxVal, then minVal. This way the problematic object is always written last, and only once.

    public void writeExternal(ObjectOutput out) throws IOException {
        FormatableHashtable fh = new FormatableHashtable();
        fh.putLong("numRows", numRows);
        fh.putLong("numUnique", numUnique);
        fh.putLong("nullCount", nullCount);
        out.writeObject(fh);

        // Decide first, write afterwards. A finally block would also run
        // after an early return and append a second record to the stream.
        boolean sameValue = false;
        try {
            sameValue = maxVal.equals(maxVal, minVal).getBoolean();
        } catch (StandardException e) {
            // Comparison failed: treat as "not equal" and write both values.
        }

        if (sameValue) {
            // Indicator true: a single value follows.
            out.writeBoolean(true);
            out.writeObject(minVal);
        } else {
            // Indicator false: maxVal follows, then minVal.
            out.writeBoolean(false);
            out.writeObject(maxVal);
            out.writeObject(minVal);
        }
    }

    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        FormatableHashtable fh = (FormatableHashtable) in.readObject();
        numRows = fh.getLong("numRows");
        numUnique = fh.getLong("numUnique");
        nullCount = fh.getLong("nullCount");

        if (in.readBoolean()) {
            // Only one value was written; rebuild minVal as a copy of it.
            maxVal = (DataValueDescriptor) in.readObject();
            minVal = maxVal.cloneValue(true);
        } else {
            maxVal = (DataValueDescriptor) in.readObject();
            minVal = (DataValueDescriptor) in.readObject();
        }
    }

> Enhance derby statistics for more accurate selectivity estimates.
> -----------------------------------------------------------------
>
>                 Key: DERBY-6940
>                 URL: https://issues.apache.org/jira/browse/DERBY-6940
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Harshvardhan Gupta
>            Assignee: Harshvardhan Gupta
>            Priority: Minor
>         Attachments: DERBY-6940_2.diff, DERBY-6940_3.diff, derby-6940.diff,
>                      EOFException_derby.log, EOFException.txt
>
>
> Derby should collect extra statistics at index build time and at statistics
> refresh time, which will help the optimizer make more precise selectivity
> estimates and choose better execution paths.
> We eventually want to utilize the new statistics to make better selectivity
> estimates / cost estimates that will help find the best query plan. Currently
> Derby keeps two types of stats - the total row count and the number of unique
> values.
> We are initially extending the stats to include the null count, the minimum
> value and the maximum value associated with each of the columns of an index.
> This would be useful in selectivity estimates for operators such as
> [ IS NULL, <, <=, >, >= ], all of which currently rely on hardwired
> selectivity estimates.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
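
For illustration only, here is a rough sketch of how the null count, minimum and maximum described in the issue could feed selectivity estimates for IS NULL and <. It is not Derby optimizer code: the class SelectivitySketch, the ColumnStats holder and both method names are hypothetical, the column is assumed to be numeric, and the linear interpolation between minVal and maxVal assumes uniformly distributed values, which the issue does not state.

```java
public class SelectivitySketch {

    // Hypothetical holder for the per-column statistics named in the issue.
    static final class ColumnStats {
        final long numRows;
        final long nullCount;
        final double minVal;
        final double maxVal;

        ColumnStats(long numRows, long nullCount, double minVal, double maxVal) {
            this.numRows = numRows;
            this.nullCount = nullCount;
            this.minVal = minVal;
            this.maxVal = maxVal;
        }
    }

    // Fraction of rows expected to satisfy "col IS NULL".
    static double isNullSelectivity(ColumnStats s) {
        if (s.numRows == 0) {
            return 0.0;
        }
        return (double) s.nullCount / s.numRows;
    }

    // Fraction of rows expected to satisfy "col < v".
    // Assumption: non-null values are spread uniformly over [minVal, maxVal].
    static double lessThanSelectivity(ColumnStats s, double v) {
        if (s.numRows == 0) {
            return 0.0;
        }
        double nonNullFraction = 1.0 - isNullSelectivity(s);
        if (v <= s.minVal) {
            return 0.0;               // nothing lies below the minimum
        }
        if (v > s.maxVal) {
            return nonNullFraction;   // every non-null row qualifies
        }
        // Here minVal < v <= maxVal, so the range is non-empty.
        return nonNullFraction * (v - s.minVal) / (s.maxVal - s.minVal);
    }

    public static void main(String[] args) {
        ColumnStats stats = new ColumnStats(1000, 100, 0.0, 200.0);
        System.out.println("IS NULL: " + isNullSelectivity(stats));
        System.out.println("< 50:    " + lessThanSelectivity(stats, 50.0));
    }
}
```

The remaining operators (<=, >, >=) would follow the same pattern. Skewed data would need something richer than min/max, such as a histogram.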