The following is from: http://geode-docs.cfapps.io/docs/developing/data_serialization/data_serialization_options.html
> Geode serialization (either PDX Serialization or Data Serialization) does
> not support circular object graphs whereas Java serialization does. In
> Geode serialization, if the same object is referenced more than once in
> an object graph, the object is serialized for each reference, and
> deserialization *produces multiple copies* of the object. By contrast in
> this situation, Java serialization serializes the object once and when
> deserializing the object, it produces one instance of the object with
> multiple references.

So even if your graphs do not have cycles, you may get duplication if nodes in the graph are referenced more than once.

Keep in mind that each value stored in a Geode region is serialized as a single BLOB and transmitted over the network. If you will be modifying your large arrays or graphs and only need to change a relatively small part of them, look into the delta propagation feature:
http://geode.docs.pivotal.io/docs/developing/delta_propagation/chapter_overview.html

For very large objects, I would think you might want to keep all access to the data on the server instead of transmitting the large object back to the client. So you might be planning to do this with functions that access those large arrays and graphs on the server, compute some result there, and then send just that result back to your client. In this case you would want to keep the data deserialized on the server so you can access it quickly without needing to deserialize it first.

One of the features of PDX is that it allows you to access the fields of an object without needing that class on the server and without needing to deserialize the data. But I don't think you need this feature of PDX. Your data will initially be stored in serialized form in the region, but once you access a region value on the server (for example from a function or a cache listener) it will be kept in deserialized form on the server from then on.
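To make the shared-reference point concrete, here is a minimal sketch that uses only the JDK. It does not call any Geode API: the names SharedReferenceDemo, Node and Parent are made up for illustration, and the second method only imitates "serialized for each reference" by writing the same object to two separate streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SharedReferenceDemo {

    // Hypothetical graph node; stands in for a scene-graph element.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node(String name) { this.name = name; }
    }

    // Hypothetical parent that may reference the same Node twice.
    static class Parent implements Serializable {
        private static final long serialVersionUID = 1L;
        final Node left;
        final Node right;
        Parent(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Serialize a value with plain Java serialization and read it back.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Java serialization: one stream, so the shared Node stays one instance.
    static boolean sharedAfterJavaSerialization() {
        Node shared = new Node("shared");
        Parent copy = roundTrip(new Parent(shared, shared));
        return copy.left == copy.right;
    }

    // Imitation of per-reference serialization (NOT Geode code): each
    // reference is written on its own, so two distinct copies come back.
    static boolean perReferenceCopiesAreDistinct() {
        Node shared = new Node("shared");
        Node first = roundTrip(shared);
        Node second = roundTrip(shared);
        return first != second && first.name.equals(second.name);
    }

    public static void main(String[] args) {
        System.out.println("Java serialization keeps one instance: "
                + sharedAfterJavaSerialization());
        System.out.println("Per-reference copies are distinct: "
                + perReferenceCopiesAreDistinct());
    }
}
```

If your graph code relies on identity (for example `==` comparisons, or mutating a node and expecting every parent to see the change), the second case is the one to plan for when the graph is stored as a region value.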
On Tue, Jan 26, 2016 at 6:03 PM, Joseph Winston <[email protected]> wrote:

> I am looking for a document or hints on best practices when using PDX.
> The two specific use cases that I'm interested in understanding are:
>
> 1. Large arrays - Currently these data types are kept in a shared memory
> segments that are organized using the most common access pattern (For
> example: z fastest, then y, then x). When using PDX, should a single large
> array that normally is on the order of 100s of GB be broken into smaller
> objects, say z slices to help with loading the data? Are there better ways
> to use PDX for these 3D and higher dimension arrays?
>
> 2. Graphs - One common data type is a directed acyclic graph, specifically
> a scene graph, that holds graphical representations of business objects.
> What is the best way to use PDX for large graphs?
>
> Thanks
