I was able to boil my stuff down quite a bit by focusing on the 
troublemaker portion. See attached unit test.  Using h2-latest.jar, I seem 
to get the IllegalStateException ("negative position exception") after 2-5 
minutes of letting this run.  Using h2-1.3.173-release, after 5-6 minutes I 
see the classic signs of a memory leak in jvisualvm (I didn't wait long 
enough for an OutOfMemoryError).
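
One observation that may help narrow this down: the position reported in 
the trace quoted below, -9223372036854646992, is exactly 
Long.MIN_VALUE + 128816. In other words it is a small positive number with 
only the sign bit set, not a random-looking value. The snippet below is 
just a minimal check of that arithmetic in plain Java. The class and method 
names are made up for illustration, and it assumes nothing about how 
MVStore actually encodes page positions.

```java
// Hypothetical helper class, for illustration only; not part of H2.
final class NegativePositionCheck {

    /** Returns the value with the sign bit (bit 63) cleared. */
    static long withoutSignBit(long value) {
        return value & Long.MAX_VALUE;
    }

    public static void main(String[] args) {
        long reported = -9223372036854646992L; // copied from the exception message
        // Distance from Long.MIN_VALUE; cannot overflow since both operands are negative.
        System.out.println(reported - Long.MIN_VALUE);   // 128816
        System.out.println(withoutSignBit(reported));    // 128816
        System.out.println(Long.toHexString(reported));  // 800000000001f730
    }
}
```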

I've tinkered with a bunch of different aspects of this:
- writeBufferSize()
- with and without mvstore.store() calls every 25k records
- various cache sizes
- store-per-map instead of storing all maps in 1 store/file
- different data types in the mrrels MVMap (String[], List<String>, 
Map<String,String>)

I realize this test may look a little strange, but there is a lot more 
going on in my real tool, and this is representative of how it works and of 
the strange data I'm working with.

It seems like my issue is in the second loop, where I'm interspersing lots 
of .get()'s from 1 map and .put()'s to another.
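
To make that pattern concrete without needing H2 on the classpath, here is 
a minimal sketch that uses java.util.TreeMap as a stand-in for MVMap. That 
substitution is an assumption purely for illustration, so this will not 
reproduce the exception, and the class and method names are hypothetical. 
What it does show is the read-modify-write cycle: each value is fetched 
with get(), grown by one entry, and put() back under the same key. In the 
attached test, 4M AUIs x 5 relationships are spread over roughly 100k 
distinct URNs, so each stored value keeps growing, to the order of 200 
entries on average.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical stand-in for the second loop of the attached test; not H2 code.
final class AccessPatternSketch {

    static String aui(long id) {
        return String.format("A%07d", id);
    }

    /** Builds both maps and returns the relationship map (urn1 -> {urn2 -> rel}). */
    static Map<String, Map<String, String>> run(int auiCount, int urnCount,
                                                int relsPerAui, long seed) {
        Random rnd = new Random(seed);

        // First loop: sequential AUIs mapped to random URNs.
        Map<String, String> aui2urn = new TreeMap<>();
        for (int i = 0; i < auiCount; i++) {
            aui2urn.put(aui(i), String.format("urn:foo:%07d", rnd.nextInt(urnCount)));
        }

        // Second loop: gets from one map interleaved with puts to another.
        Map<String, Map<String, String>> mrrels = new TreeMap<>();
        for (int i = 0; i < auiCount; i++) {
            for (int j = 0; j < relsPerAui; j++) {
                String urn1 = aui2urn.get(aui(i));
                String urn2 = aui2urn.get(aui(rnd.nextInt(auiCount)));
                Map<String, String> rels = mrrels.get(urn1);   // read
                if (rels == null) {
                    rels = new HashMap<>();
                }
                rels.put(urn2, "foo:bar");                     // modify (value grows)
                mrrels.put(urn1, rels);                        // write back
            }
        }
        return mrrels;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> mrrels = run(1000, 10, 5, 42L);
        System.out.println("distinct urn1 keys: " + mrrels.size());
    }
}
```

With small counts like these it finishes instantly. The point is only the 
shape of the access pattern, not the timing or the failure.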

Thanks for your help,
Brian

On Wednesday, October 9, 2013 5:05:38 PM UTC-6, Thomas Mueller wrote:
>
> Hi,
>
> I'm sorry I can't say what the problem is. It kind of looks like it's 
> trying to read from a chunk while the chunk is not yet written. But I would 
> need to know what you are doing exactly.
>
> - Did you start with a new file?
> - How large is the file when the problem occurs?
> - How do you open the MVStore?
> - How do you process the data (concurrently, when do you call commit, do 
> you call save yourself and if yes when, ...)?
> - How large is a typical record?
> - What data types do you use?
>
> Regards,
> Thomas
>
>
>
> On Wed, Oct 9, 2013 at 11:41 PM, Brian Bray <[email protected]> wrote:
>
>> Thanks Thomas!  So I switched to h2-latest.jar and it fails sooner with 
>> an IllegalStateException:
>>
>> Caused by: java.lang.IllegalStateException: Negative position -9223372036854646992 [1.3.173/6]
>>         at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:709)
>>         at org.h2.mvstore.MVStore.readPage(MVStore.java:1439)
>>         at org.h2.mvstore.MVMap.readPage(MVMap.java:759)
>>         at org.h2.mvstore.Page.getChildPage(Page.java:207)
>>         at org.h2.mvstore.MVMap.binarySearch(MVMap.java:449)
>>         at org.h2.mvstore.MVMap.binarySearch(MVMap.java:450)
>>         at org.h2.mvstore.MVMap.get(MVMap.java:431)
>>
>> Here is what's odd, though: I have 3 large CSV files I'm parsing right 
>> now (#1=500MB/4M rows, #2=2.2GB/26M rows, #3=1.3GB/15M rows). It makes it 
>> through the first 2 fine, then I get the above exception about 100k rows 
>> into file #3.
>>
>> File #3 was also the only file failing with the previously mentioned 
>> BufferOverflowException in 1.3.173-release (it also seemed to have a slow 
>> memory leak), so I'm wondering if I have some weird data in file #3 that is 
>> causing issues. Again, it might be difficult to extract a unit test out of 
>> this, but focusing on replicating file #3 might make it a bit easier.
>>
>> Brian
>>
>> On Wednesday, October 9, 2013 1:00:32 PM UTC-6, Thomas Mueller wrote:
>>
>>> Hi,
>>>
>>> We fixed quite a number of bugs in the MVStore recently. I suggest 
>>> trying again with the nightly build at 
>>> http://h2database.com/html/build.html#automated (direct link: 
>>> http://www.h2database.com/automated/h2-latest.jar), or of course you 
>>> could build it yourself.
>>>
>>> > I'm currently using MVStore as a large, temporary disk cache
>>>
>>> Noel and I recently implemented off-heap storage for the MVStore, so 
>>> it uses memory (not disk) outside the normal Java heap. This might be 
>>> interesting for you. The documentation is not online yet (as it's not 
>>> released yet); for now it is only available in source form at 
>>> https://h2database.googlecode.com/svn/trunk/h2/src/docsrc/html/mvstore.html 
>>> (see 'off-heap'). To use it, call:
>>>
>>> OffHeapStore offHeap = new OffHeapStore();
>>> MVStore s = new MVStore.Builder().
>>>         fileStore(offHeap).open();
>>>
>>> I'm also thinking about combining the LIRS cache with the MVStore, so 
>>> that you could build an off-heap LIRS cache. It shouldn't be complicated to 
>>> implement (the cache would simply need a map factory).
>>>
>>> Regards,
>>> Thomas
>>>
>>>
>>>
>>> On Wed, Oct 9, 2013 at 7:52 PM, Brian Bray <[email protected]> wrote:
>>>
>>>> Thomas,
>>>>
>>>> Here is a more elusive issue, and I'm wondering whether it's a bug in 
>>>> MVMap/MVStore.
>>>>
>>>> I'm currently using MVStore as a large, temporary disk cache. I have a 
>>>> Java program that scans about 10GB of raw CSV-ish files and, for each 
>>>> file, plucks out a few fields I care about and stores them in an MVMap. 
>>>> At the end, I merge the data from several MVMaps into a small JSON 
>>>> document which represents all the data buried in the CSV files for a 
>>>> particular entity. I then store that JSON document in H2 for my app to use.
>>>>
>>>> It's been working fairly well (and fast!), but there is one weird issue 
>>>> I'm encountering: it throws the exception below after 6-8M rows have 
>>>> been processed. It's going to be tough to extract a test case, so I was 
>>>> wondering if you had any insight into this. It seems to go away when I 
>>>> shorten some key sizes, but I don't know if I'm just delaying the problem 
>>>> and eventually this would still happen.
>>>>
>>>> I'm using 1.3.173. Basically, every 50k records or so I call 
>>>> MVStore.store() to flush to disk, and eventually it throws this:
>>>>
>>>> Caused by: java.nio.BufferOverflowException
>>>>         at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:183)
>>>>         at java.nio.ByteBuffer.put(ByteBuffer.java:832)
>>>>         at org.h2.mvstore.type.ObjectDataType$SerializedObjectType.write(ObjectDataType.java:1515)
>>>>         at org.h2.mvstore.type.ObjectDataType.write(ObjectDataType.java:113)
>>>>         at org.h2.mvstore.Page.write(Page.java:799)
>>>>         at org.h2.mvstore.Page.writeUnsavedRecursive(Page.java:860)
>>>>         at org.h2.mvstore.Page.writeUnsavedRecursive(Page.java:855)
>>>>         at org.h2.mvstore.Page.writeUnsavedRecursive(Page.java:855)
>>>>         at org.h2.mvstore.Page.writeUnsavedRecursive(Page.java:855)
>>>>         at org.h2.mvstore.MVStore.store(MVStore.java:921)
>>>>         at org.h2.mvstore.MVStore.store(MVStore.java:813)
>>>>
>>>> Thanks,
>>>> Brian
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "H2 Database" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to h2-database...@googlegroups.com.
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at http://groups.google.com/group/h2-database.
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>
>>>
>
>

import java.io.File;
import java.util.HashMap;
import java.util.Map;

import org.h2.mvstore.MVMap;
import org.h2.mvstore.MVStore;
import org.junit.Test;

public class RRFBuilderTests {
	
	private static final int AUI2URN_COUNT=4_000_000;
	private static final int URN_COUNT=100_000;
	private static final int REL_COUNT=15_000_000;
	private static final int BLOCK_SIZE = 25_000;
	
	/** taking a stab at replicating my MVStore issues as a long-running unit test */
	@Test
	public void testRRFBuilder() {
		File file = new File(System.getProperty("java.io.tmpdir"), "/tmp.cache");
		file.delete(); // ensure an empty file
		MVStore mvstore = new MVStore.Builder().writeBufferSize(25).fileName(file.getAbsolutePath()).cacheSize(100).open();
		
		// first generate AUI to URN map to simulate MRCONSO.RRF data
		// by mapping n sequential AUI's to random urn's.
		MVMap<String,String> aui2urn = mvstore.openMap("aui2urn");
		for (int i=0; i < AUI2URN_COUNT; i++) {
			aui2urn.put(getAUI(i), getRandomURN());
			if (i % BLOCK_SIZE == 0) {
				System.out.printf("aui2urn's generated: %s/%s\n", i, AUI2URN_COUNT);
				mvstore.store();
			}
		}
		mvstore.store();
		
		// simulate relationship records from MRREL.RRF
		MVMap<String, Map<String,String>> mrrels = mvstore.openMap("mrrel");
		for (int i=0; i < AUI2URN_COUNT; i++) {
			// for each AUI, generate 5 random mappings
			for (int j=0; j < 5; j++) {
				Map<String,String> rec = buildRec(getAUI(i));
				String urn1 = aui2urn.get(rec.get("aui1"));
				String urn2 = aui2urn.get(rec.get("aui2"));
				Map<String,String> rels = mrrels.get(urn1);
				if (rels == null) rels = new HashMap<>();
				rels.put(urn2, rec.get("rel") + ":" + rec.get("rela"));
				mrrels.put(urn1, rels);
			}
			if (i % BLOCK_SIZE == 0) {
				// i counts AUIs, not relationships; each AUI adds 5 relationship puts
				System.out.printf("mrrels generated for AUIs: %s/%s\n", i, AUI2URN_COUNT);
				mvstore.store();
			}
		}
		mvstore.store();
	}
	
	/** Generates a random aui-to-aui relationship record */
	private static final Map<String,String> buildRec(String aui) {
		Map<String,String> rec = new HashMap<>();
		rec.put("aui1", aui);
		rec.put("aui2", getRandomAUI());
		rec.put("rel", "foo");
		rec.put("rela", "bar");
		return rec;
	}
	
	private static final String getAUI(long id) {
		return String.format("A%07d", id);
	}
	
	private static final String getRandomAUI() {
		// floor instead of round so the index stays within [0, AUI2URN_COUNT)
		return getAUI((long) (Math.random() * AUI2URN_COUNT));
	}
	
	private static final String getRandomURN() {
		return String.format("urn:foo:%07d", Math.round(Math.random()*URN_COUNT));
	}
}
