Just reading through some of the open POI bugs. Looks like this work is related to https://bz.apache.org/bugzilla/show_bug.cgi?id=58131 This bug should be updated, possibly resolved, with the work you've done here.
On Mon, Feb 13, 2017 at 7:51 AM, Greg Woolsey <[email protected]> wrote: > What I meant was external implementations of the Sheet interface. > Everything inside POI that I've done in my branch is fully implemented with > tests and JavaDocs. > > On Sun, Feb 12, 2017 at 7:32 PM Javen O'Neal <[email protected]> wrote: > > Anything that is still stubbed can be annotated with @NotImplemented, which > will add the corresponding note to the JavaDocs. > > On Feb 12, 2017 17:08, "Greg Woolsey" <[email protected]> wrote: > >> Thanks for taking the time to inspect it, and point me to the utilities, I >> hadn't found them yet. I'll incorporate them and run all the tests >> including my new ones. >> >> Are folks OK with this as part of 3.16? I'd like that. There are some > new >> methods on some interfaces, so if there are custom implementations of >> things like Sheet, they would need changes. Returning null or false > should >> be fine though, as the only callers are the new evaluator classes, and >> anyone using those would need the new API methods anyway. >> >> On Sun, Feb 12, 2017, 5:22 PM Javen O'Neal <[email protected]> wrote: >> >> > I haven't read through all your changes on GitHub yet, but all the >> > changes so far look good. I have a few other suggestions to >> > deduplicate some code using SheetUtil and FormulaShifter, but those >> > changes can be made at a later date if needed. >> > >> > You should be able to use git-svn to push your changes. Read through >> > and improve our git documentation [1] if necessary. >> > >> > [1] https://poi.apache.org/guidelines.html#Approach+3+-+the+git+way >> > >> > On Fri, Feb 10, 2017 at 12:10 AM, Greg Woolsey <[email protected]> >> > wrote: >> > > Well, I couldn't stand the incomplete support, so now this supports >> > > evaluating rules for all the different types, including range >> aggregates >> > > like "greater than 2 standard deviations" and "top 10". Still doesn't >> > > provide help assigning partitioning buckets for icon sets and colors, >> but >> > > everything else is working. >> > > >> > > I filed a big bug with Vaadin, listing 5 core design problems I've >> found >> > > with their Conditional Formatting implementation, and offering my >> > > replacement for their code that uses the new POI evaluator instead. >> They >> > > bit, and are interested, but I won't make my first commit some > behemoth >> > > that hasn't received any feedback. I know there are conventions and >> > ideas >> > > I've missed :) >> > > >> > > I need both sets of changes for my day job, so I'm all-in on doing it >> > > right in both directions and facilitating the conversations. >> > > >> > > On Tue, Feb 7, 2017 at 11:50 PM Greg Woolsey <[email protected]> >> > wrote: >> > > >> > >> Now this fork also contains ConditionalSpreadsheetEvaluator, and >> related >> > >> code. The unit test is essentially a stub, but tests one basic style >> > for >> > >> proof of concept. >> > >> >> > >> I've actually implemented a version of Vaadin Spreadsheet that uses >> this >> > >> new code to see how it performs, and I'm quite happy with both the >> > improved >> > >> performance (~50% faster than theirs) and feature coverage/accuracy. >> > I've >> > >> found 5 major bugs so far in what they did, most likely the result of >> > the >> > >> complexity of the document structure and the fact that several key >> > pieces >> > >> of information where still buried in the implementation classes, and >> > hadn't >> > >> been surfaced yet to the SS interfaces. I've done that in this > branch >> > also. >> > >> >> > >> My code here is my own, I didn't like anything I saw elsewhere enough >> to >> > >> copy it :) >> > >> >> > >> Evaluation currently doesn't support range-based conditions, such as >> > >> TOP_10, DUPLICATE, etc. Those don't seem like they'd be that bad to >> > do, if >> > >> someone wants to take a stab at them. I don't need them (yet), so >> they >> > >> just evaluate to "false" with a TODO comment for now. >> > >> >> > >> Likewise, there is no code to report which partition bucket a cell >> falls >> > >> into when the condition type is one of the partitioned styles, 2,3 or >> 4 >> > >> value buckets, gradient fill, etc. The fact that the rule matches >> > (based >> > >> on range) is available, the caller would need to evaluate the rule >> type >> > and >> > >> see what lies beneath. >> > >> >> > >> I assume interested parties will take a look as they have time and >> > >> inclination. I'm sure there are areas to discuss, beyond where to > put >> > the >> > >> curly braces :) I left some comments as to alternate strategies for >> > some >> > >> areas, where I opted for less change to existing classes as a > starting >> > >> point, even if it means a switch...case here or there when a new >> method >> > >> could be added to an Enum class instead. >> > >> >> > >> Hopefully the new methods on the SS interfaces are deemed minor - the >> > >> values were already there in most cases, at least on one side or the >> > other >> > >> (HSSF/XSSF), with a static default to use for the other one per MS >> > >> documentation. >> > >> >> > >> Greg >> > >> >> > >> On Wed, Feb 1, 2017 at 5:38 PM Greg Woolsey <[email protected]> >> > >> wrote: >> > >> >> > >> Oh, the primary class is o.a.p.ss.formula.DataValidationEvaluator >> > >> >> > >> On Wed, Feb 1, 2017 at 5:37 PM Greg Woolsey <[email protected]> >> > >> wrote: >> > >> >> > >> My GitHub branch now contains Data Validation code and unit tests. >> The >> > >> test file DataValidationEvaluations.xlsx contains a large set of >> > validation >> > >> examples, including one formula example that applies to a range of >> cells >> > >> and uses a relative formula. The evaluation code has corresponding >> > logic >> > >> to offset the relative formula Ptgs from the top left of the region. >> > >> >> > >> Every test is labeled in the file with column A as a description, >> > column B >> > >> as the cell with validation, and column C the expected result, TRUE = >> > >> valid, FALSE = invalid. >> > >> >> > >> The unit test compares the POI validation result with the expected >> > column, >> > >> failing on boolean mismatches. >> > >> >> > >> Have not had time to run all tests yet, but this should only be code >> > >> additions, not modifications. I'll run them soon. >> > >> >> > >> I'm sure there are code style discussions to be had - for example I >> > >> implemented some things as inner classes for now, but we may want > them >> > >> top-level instead. >> > >> >> > >> Comments welcome, this is early code but is built on top of the SS >> > >> interfaces, so should be stable for HSSF and XSSF. >> > >> >> > >> Greg >> > >> >> > >> On Mon, Jan 30, 2017 at 9:55 AM Greg Woolsey <[email protected]> >> > >> wrote: >> > >> >> > >> Also, I just found this sample workbook >> > >> < >> > http://download.microsoft.com/download/1/6/F/16F701E9-63BA- >> 48D3-8B48-096F9288F443/AF010235700_en-us_cfsamples_af010235700.xlsx> >> > in >> > >> the Excel online support docs. If I have time to turn that into a >> unit >> > >> test, it's about as complete as we could want. Some parts are lost >> > saving >> > >> as HSSF, but we can then test that we evaluate what remains the same >> > way as >> > >> newer Excel when opening a legacy formatted file. >> > >> >> > >> On Mon, Jan 30, 2017 at 9:38 AM Greg Woolsey <[email protected]> >> > >> wrote: >> > >> >> > >> Thanks, that makes sense wrt custom implementations of >> FormulaEvaluator >> > - >> > >> I hadn't thought about anyone rolling their own, but it's an >> interface, >> > so >> > >> quite possible. Too bad we can't require Java 8 yet and use default >> > >> methods. >> > >> >> > >> I can work with the new *Evaluator class idea. And the HSSF >> limitations >> > >> will just mean more unit tests :) I have Excel 2016 available so I >> can >> > >> create test workbooks, save them as both XLSX and XLS, and compare >> > >> evaluations. I can then write unit tests based on them that expect >> the >> > >> results seen in Excel. That should give us reference points for >> > confidence >> > >> in our replication of their logic, especially around rule >> priority/order >> > >> and XLS HSSF files. >> > >> >> > >> On Fri, Jan 27, 2017 at 11:05 PM Nick Burch <[email protected]> >> > wrote: >> > >> >> > >> On Sat, 28 Jan 2017, Greg Woolsey wrote: >> > >> > As noted in one of the method JavaDocs, we also need to expose and >> > make >> > >> use >> > >> > of the ConditionalFormattingRule "priority" attribute. That's key >> to >> > >> > matching the right rule when more than one rule applies to a cell. >> > Only >> > >> > the first match in priority order is applied. >> > >> >> > >> Your slight challenge is that not all Conditional Formatting rules >> have >> > a >> > >> priority... XLSX ones do, and newer XLS ones based on CFRule12Record >> > (sid >> > >> = 0x087A) do, but the older XLS ones (CFRuleRecord / 0x01B1) don't. >> I'm >> > >> not sure what Excel does for those, but my hunch (based on our API) > is >> > >> that it uses their order as a priority. >> > >> >> > >> >> > >> > I've created a fork in GitHub for this, and committed a stab at >> > >> > high-level API methods that could be added to the FormulaEvaluator >> > >> > interface: >> > >> > >> > >> >> > https://github.com/WoozyG/poi/commit/d44fee7bd03ed450af589467ec90e2 >> 581b9f2b16$ >> > >> >> > >> FormulaEvaluator is an interface, which we have 4 implementations of >> in >> > >> our codebabse, and I'd guess that other complex users of POI will > have >> > >> dozens more. I'm not sure, therefore, that we want to be putting all >> of >> > >> the CF and DV logic methods on there, especially as it'll be common > to >> > all >> > >> implementations >> > >> >> > >> The HSSF classes for CF all use org.apache.poi.ss.formula.Formula >> which >> > is >> > >> PTG based. The HSSF classes for DV seem to store the raw PTGs. >> > >> >> > >> If we added two new SS usermodel classes, eg >> > >> ConditionalFormattingEvaluator and DataValidationEvaluator, these >> could >> > be >> > >> classes (not interfaces) with your proposed new methods on. They > could >> > >> hold the logic (once) for all formats (as it's basically the same on >> > all) >> > >> for priority, checking etc >> > >> >> > >> Doing that would also mean that "our" new classes could call out to >> our >> > >> existing low-level ones to evaluate formulas. That would mean we >> > wouldn't >> > >> have to make a breaking change to the FormulaEvaluator interface too >> > >> >> > >> Might that work for you? >> > >> >> > >> > No implementations have been done yet, and the Vaadin comments >> > indicate >> > >> > HSSF doesn't parse conditional formatting properly or something, > and >> > >> can't >> > >> > be evaluated correctly currently. I don't know exactly what they >> > found >> > >> > wrong, and it's rather annoying they didn't file any bugs. >> > >> >> > >> I think that comment is out of date, from before the CF work in 3.13 >> > >> >> > >> Nick >> > >> >> > >> --------------------------------------------------------------------- >> > >> To unsubscribe, e-mail: [email protected] >> > >> For additional commands, e-mail: [email protected] >> > >> >> > >> >> > >> > --------------------------------------------------------------------- >> > To unsubscribe, e-mail: [email protected] >> > For additional commands, e-mail: [email protected] >> > >> > >> --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
