Sure, that will be an easy enough thing to refactor later. There are a couple of sensible options. For the moment I want to get it functional, and I'll ensure the memory doesn't go pop.
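One of those options, the PipedOutputStream/PipedInputStream pair with a separate thread that Lars suggests in the quoted message, might look roughly like the following. This is only a minimal sketch under stated assumptions: the names PipedImportSketch, importDataValueSet and convertAndImport are hypothetical stand-ins rather than DHIS 2 code, and the XML written is a placeholder for real dxf2 output.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the pipe approach; not actual DHIS 2 code.
class PipedImportSketch {

    // Stand-in for the dxf2 importer: drains the stream and returns the byte count.
    static long importDataValueSet(InputStream in) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }

    // Writes placeholder XML into the pipe on a worker thread while the
    // calling thread hands the other end of the pipe to the importer.
    static long convertAndImport(List<String> values) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try (PipedInputStream in = new PipedInputStream(8192)) {
            PipedOutputStream out = new PipedOutputStream(in);
            Future<Object> writer = executor.submit(() -> {
                // Closing the output end signals end-of-stream to the reader.
                try (PipedOutputStream o = out) {
                    o.write("<dataValueSet>".getBytes(StandardCharsets.UTF_8));
                    for (String value : values) {
                        String element = "<dataValue value=\"" + value + "\"/>";
                        o.write(element.getBytes(StandardCharsets.UTF_8));
                    }
                    o.write("</dataValueSet>".getBytes(StandardCharsets.UTF_8));
                }
                return null;
            });
            long bytes = importDataValueSet(in);
            writer.get(); // surface any failure from the writer thread
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdownNow(); // also interrupts a writer that is still blocked
        }
    }

    public static void main(String[] args) {
        long bytes = convertAndImport(Arrays.asList("1", "2", "3"));
        System.out.println("Importer consumed " + bytes + " bytes");
    }
}
```

With this shape, only the pipe's fixed-size buffer (8192 bytes here) is held in memory at any time, however many data values the group contains. The sketch creates one short-lived single-thread executor per call; whether that is acceptable or a shared pool is needed is the question still open in the thread.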

On 18 June 2015 at 20:54, Lars Helge Øverland <larshe...@gmail.com> wrote:
> Hi,
>
> okay, yes, it's maybe not an ideal solution here. I think I would favor a
> PipedOutputStream/PipedInputStream pair with a separate thread over an
> in-memory DOM.
>
> Do we really need a separate thread pool? We fork off threads many places in
> the system already, e.g. with parallel analytics queries. I thought as long
> as it's limited to one or a few per process it should be handled by the JVM.
> But I might be wrong.
>
> On Thu, Jun 18, 2015 at 8:46 PM, Bob Jolliffe <bobjolli...@gmail.com> wrote:
>>
>> Hi Lars
>>
>> The problem is that the dataValueSetService requires an InputStream to
>> feed off. There are only 2 ways to provide an InputStream that I can
>> think of: either create a pipe or a buffer (e.g. with a string).
>>
>> Creating a pipe is doable, but then you also need to create a separate
>> thread to read it, which is another resource to manage (e.g. with a pool),
>> and that seemed like more effort than it is worth.
>>
>> What I can do short term as a defensive measure is to place a limit on
>> the number of datavalues which can be buffered for a single
>> datavalueset. That way it should not be possible to explode the
>> memory. I'll do that soon.
>>
>> Note that in "normal" use this should not be a problem, as a single adx
>> group corresponds to the data for one orgunit, for one period - what
>> is envisaged typically is a single dataset's worth.
>>
>> The other "alternative" is not to use the dataValueSetService at all
>> but just duplicate the code.
>>
>> Bob
>>
>> On 18 June 2015 at 15:22, Lars Helge Øverland <larshe...@gmail.com> wrote:
>> > Hi Bob,
>> >
>> > as you say, this creates a hard limit on memory. Now all it will take to
>> > bring down a DHIS 2 instance is to submit a sufficiently large import
>> > file. Seems like this will cause headaches for server admins ;) Can we
>> > find a stream-based solution which scales well?
>> >
>> > Lars
>> >
>> > On Thu, Jun 18, 2015 at 2:49 PM, Bob Jolliffe <bobjolli...@gmail.com>
>> > wrote:
>> >>
>> >> WIP committed and slight adjustment of strategy ...
>> >>
>> >> I was not comfortable with creating a new thread just to pipe from adx
>> >> to dxf.
>> >>
>> >> So instead, for each adx group corresponding to a dataValueSet with
>> >> orgUnit, period (and potentially attributeOptionCombo), I create a
>> >> dataValueSet DOM document and present that to the dxf2 stream importer
>> >> as a stream. Given that this data is bound by a single orgunit and
>> >> period, I don't think the DOM document is going to break the memory
>> >> bank.
>> >>
>> >> Basic conversion to dxf2 is working fine.
>> >>
>> >> Next task is to "implode" the categories.
>> >>
>> >> A luta continua.
>> >>
>> >> On 12 June 2015 at 13:40, Bob Jolliffe <bobjolli...@gmail.com> wrote:
>> >> > Hi
>> >> >
>> >> > As you have seen, I have already started to commit a few bits of code
>> >> > in support of the ADX implementation. I hadn't been planning to do
>> >> > this so will proceed quite slowly, but let me outline the approach I
>> >> > am considering for your comment and suggestion.
>> >> >
>> >> > 1. Currently we have a datavalueset service which can import dxf2
>> >> > data from an inputstream.
>> >> >
>> >> > 2. I would like to use that existing service and place the adx
>> >> > service as a thin veneer above it rather than create a lot of
>> >> > duplicated code.
>> >> >
>> >> > 3. The adx data importer would read its adx input from a stream and
>> >> > convert that into a dxf2 stream. The main tasks it would need to
>> >> > perform are:
>> >> > (i) convert periods into dxf2 format
>> >> > (ii) look up catoptcombos and attributeoptioncombos for the dimensions
>> >> > in the adx message
>> >> > All other attributes and ImportOptions would be passed through
>> >> > directly to the dxf2 datavalueset service.
>> >> >
>> >> > 4. In order to present the resulting dxf2 to the service as an
>> >> > InputStream it would have to use a PipeReader/PipeWriter combination
>> >> > (something Lars will recall from earlier dxf1 code). The equivalent
>> >> > alternative would be to post the dxf2 datasets back out to the REST
>> >> > endpoint, but that seems wasteful and more awkward.
>> >> >
>> >> > Does that approach sound reasonable?
>> >> >
>> >> > I have some lingering uncertainty about the best way to deal with
>> >> > ImportSummary. The adx data is naturally grouped by orgunit/period.
>> >> > So I would likely split the stream and post each as a separate dxf2
>> >> > datavalueset. So probably this would imply collecting the results
>> >> > into an <ImportSummaries ... /> element. ADX is currently silent on
>> >> > the result message as it deliberately does not define the transaction
>> >> > (just the message), so we have some latitude here to do whatever is
>> >> > best. The above is my best suggestion.
>> >> >
>> >> > Cheers
>> >> > Bob
>> >>
>> >> --
>> >> Mailing list: https://launchpad.net/~dhis2-devs-core
>> >> Post to     : dhis2-devs-core@lists.launchpad.net
>> >> Unsubscribe : https://launchpad.net/~dhis2-devs-core
>> >> More help   : https://help.launchpad.net/ListHelp
>> >
>> > --
>> > Lars Helge Øverland
>> > Lead developer, DHIS 2
>> > University of Oslo
>> > Skype: larshelgeoverland
>> > http://www.dhis2.org
>
> --
> Lars Helge Øverland
> Lead developer, DHIS 2
> University of Oslo
> Skype: larshelgeoverland
> http://www.dhis2.org

--
Mailing list: https://launchpad.net/~dhis2-devs-core
Post to     : dhis2-devs-core@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs-core
More help   : https://help.launchpad.net/ListHelp