OK, sorry, please scratch the last question; it was the mapping's fault, namely
not passing a proper context to the wrapper.
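
A minimal sketch of wrapping a preexisting row-major host buffer with an
explicit MAIN_MEMORY context (assuming the matrix(ptr, memory_type, rows, cols)
wrapping constructor of recent ViennaCL 1.x; check the exact signature against
your version):

#include <vector>
#include "viennacl/matrix.hpp"

int main()
{
  std::size_t rows = 4, cols = 4;
  std::vector<double> buf(rows * cols, 1.0);  // preexisting row-major data

  // Wrap the existing buffer instead of copying it; the memory type
  // (here MAIN_MEMORY) selects the context backing the matrix.
  viennacl::matrix<double> A(&buf[0], viennacl::MAIN_MEMORY, rows, cols);

  A += A;  // operations now run directly on the wrapped host data
  return 0;
}
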
On Wed, Jul 13, 2016 at 3:37 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> Also: when I use a wrapping constructor to initialize a MAIN_MEMORY matrix
> around a preexisting row-major buffer and then try to use this matrix, I get
> the message:
>
> ViennaCL: Internal memory error: not initialised!
>
> why?
>
>
> On Wed, Jul 13, 2016 at 2:01 PM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
>
>> So fast_copy() still copies the memory and has copying overhead, even with a
>> MAIN_MEMORY context?
>>
>> Is there a way to do a shallow copy (i.e. just pointer initialization) of
>> the matrix data buffer? Isn't that what some constructors of matrix or
>> matrix_base do?
>>
>> What I am getting at: it looks like I am paying a significant overhead just
>> for copying. Actually, it seems I am paying it twice: once when I prepare
>> the padded buffer as required by internal_size1()/internal_size2(), and then
>> again when I pass it into fast_copy(), which apparently copies once more,
>> even when we are using host-memory matrices.
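
A sketch of the two-step path described above (assuming a row_major
viennacl::matrix and the fast_copy(begin, end, matrix) overload; the helper
name to_viennacl is made up):

#include <vector>
#include "viennacl/matrix.hpp"

// Copy unpadded row-major host data into a ViennaCL matrix. Note the two
// copies: one to repack into the padded layout, one inside fast_copy().
void to_viennacl(double const * data, std::size_t rows, std::size_t cols,
                 viennacl::matrix<double> & A)
{
  // staging buffer sized to the internal (padded) dimensions of A
  std::vector<double> staging(A.internal_size1() * A.internal_size2(), 0.0);
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      staging[i * A.internal_size2() + j] = data[i * cols + j];

  // fast_copy() then transfers the whole padded buffer into A
  viennacl::fast_copy(&staging[0], &staging[0] + staging.size(), A);
}
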
>>
>> All in all, by my estimates this copying back and forth (which, granted, is
>> not greatly optimized on our side) takes ~15..17 seconds out of 60 seconds
>> total when multiplying 10k x 10k dense arguments via ViennaCL. I also compile
>> with -march=haswell and -ffast-math; without those I seem to fall too far
>> behind what R + OpenBLAS can do in this test, and my processing time swells
>> to about 2 minutes without the relaxed (non-IEEE-compliant) arithmetic.
>>
>> If I can wrap the buffer and avoid copying for the MAIN_MEMORY context, I'd
>> shave off another 10% or so of the execution time. That would make me
>> happier, as I could probably beat OpenBLAS given custom CPU architecture
>> flags.
>>
>> On the other hand, BIDMat (which allegedly uses MKL) does the same test, in
>> double precision, in under 10 seconds. I can't fathom how, but it does. I
>> have a Haswell-E platform.
>>
>> thank you.
>> dmitriy
>>
>> On Tue, Jul 12, 2016 at 9:27 AM, Karl Rupp <r...@iue.tuwien.ac.at> wrote:
>>
>>> Hi,
>>>
>>>> One question: you mentioned padding for the `matrix` type. When I
>>>> initialize the `matrix` instance, I only specify dimensions. How do I
>>>> know the padding values?
>>>>
>>>
>>> If you want to provide your own padded dimensions, consider using
>>> matrix_base directly. If you want to query the padded dimensions, use
>>> internal_size1() and internal_size2() for the internal number of rows and
>>> columns.
>>>
>>> http://viennacl.sourceforge.net/doc/manual-types.html#manual-types-matrix
>>>
>>> Best regards,
>>> Karli
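
A quick illustration of the query route (a sketch; the padded sizes are only
known once the matrix object exists):

#include <cstddef>
#include "viennacl/matrix.hpp"

int main()
{
  std::size_t rows = 1000, cols = 1000;
  viennacl::matrix<double> A(rows, cols);        // the library chooses the padding
  std::size_t padded_rows = A.internal_size1();  // internal (padded) row count
  std::size_t padded_cols = A.internal_size2();  // internal (padded) column count
  // A flat buffer handed to fast_copy() must hold padded_rows * padded_cols
  // entries laid out in the matrix's internal layout.
  (void)padded_rows; (void)padded_cols;
  return 0;
}
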
>>>
>>>
>>>
>>>
>>>> On Tue, Jul 12, 2016 at 5:53 AM, Karl Rupp <r...@iue.tuwien.ac.at
>>>> <mailto:r...@iue.tuwien.ac.at>> wrote:
>>>>
>>>> Hi Dmitriy,
>>>>
>>>> On 07/12/2016 07:17 AM, Dmitriy Lyubimov wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am trying to create some elementary wrappers for VCL in
>>>> javacpp.
>>>>
>>>> Everything goes fine, except that I would rather not use those "cpu"
>>>> types (std::map, std::vector) and instead initialize matrices directly
>>>> from row-major or CCS-format data.
>>>>
>>>> I see that the matrix() constructor accepts this form of initialization,
>>>> but it states that the "wrapping" is done for device memory.
>>>>
>>>>
>>>> Yes, the constructors either create their own memory buffer
>>>> (zero-initialized) or wrap an existing buffer. These are the only
>>>> reasonable options.
>>>>
>>>>
>>>> Now, I can create a host matrix() using host memory and row-major
>>>> packing. This seems to work OK.
>>>>
>>>> However, these are still host instances. Can I copy host instances to
>>>> instances in an OpenCL context?
>>>>
>>>>
>>>> Did you look at viennacl::copy() or viennacl::fast_copy()?
>>>>
>>>>
>>>> That might be one way of bypassing the (in my case unnecessary)
>>>> complexity of working with the std::vector and std::map classes from
>>>> the Java side.
>>>>
>>>> But it looks like there's no copy() variant that accepts a matrix-on-host
>>>> and a matrix-on-OpenCL argument (or rather, the compiler of course
>>>> declares the call ambiguous since two overloads fit).
>>>>
>>>>
>>>> If you want to copy your OpenCL data into a viennacl::matrix, you
>>>> may wrap the memory handle (obtained with .elements()) into a vector
>>>> and copy that. If you have plain host data, use
>>>> viennacl::fast_copy() and mind the data layout (padding of
>>>> rows/columns!)
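
To make the two host-side paths concrete, a sketch (assumes the
std::vector<std::vector<double> > overload of viennacl::copy() and a default
row_major target matrix):

#include <vector>
#include "viennacl/matrix.hpp"

int main()
{
  std::size_t rows = 3, cols = 3;
  viennacl::matrix<double> vcl_A(rows, cols);

  // Path 1: viennacl::copy(): convenient, takes care of padding internally.
  std::vector<std::vector<double> > host_A(rows, std::vector<double>(cols, 1.0));
  viennacl::copy(host_A, vcl_A);

  // Path 2: viennacl::fast_copy(): faster, but the flat buffer must already
  // match the internal (padded) layout of vcl_A.
  std::vector<double> flat(vcl_A.internal_size1() * vcl_A.internal_size2(), 2.0);
  viennacl::fast_copy(&flat[0], &flat[0] + flat.size(), vcl_A);
  return 0;
}
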
>>>>
>>>>
>>>> For compressed_matrix, there seems to be a set() method, but I guess
>>>> this also requires the CCS arrays to be in device memory if I use it.
>>>> Same question: is there a way to send-and-wrap CCS arrays into an OpenCL
>>>> device instance of compressed_matrix without using std::map?
>>>>
>>>>
>>>> Currently you have to use .set() if you want to bypass
>>>> viennacl::copy() and std::map.
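
A sketch of the .set() route (note that compressed_matrix stores CSR, i.e.
row-oriented, data; the integer type expected for the index arrays depends on
the backend, so verify against your ViennaCL version):

#include "viennacl/compressed_matrix.hpp"

int main()
{
  // CSR data for a hypothetical 3x3 matrix with 4 nonzeros
  unsigned int row_jumper[]  = {0, 1, 3, 4};
  unsigned int col_indices[] = {0, 0, 2, 1};
  double       values[]      = {1.0, 2.0, 3.0, 4.0};

  viennacl::compressed_matrix<double> A;
  // set(row_jumper, col_buffer, elements, rows, cols, nonzeros)
  A.set(row_jumper, col_indices, values, 3, 3, 4);
  return 0;
}
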
>>>>
>>>> I acknowledge that the C++ type system is a pain when interfacing
>>>> from other languages. We will make this much more convenient in
>>>> ViennaCL 2.0. The existing interface in ViennaCL 1.x is too hard to
>>>> fix without breaking lots of user code, so we won't invest time in
>>>> that (contributions welcome, though :-) )
>>>>
>>>> Best regards,
>>>> Karli
>>>>
>>>>
>>>>
>>>>
>>>
>>
>