I've never played with Pentaho, but then again I am not really in data
warehousing. I have played with Cognos at a business analyst training
course, and I think Cognos' feature set is pretty complete (with cubes,
reports, trends, etc., and also SPSS plugins).

Chi - we can develop something like Platform Analytics for Grid Engine
by re-using ARCo's dbwriter, but we will need to rework the DB
schemas and redefine them for data warehousing. I think the change
is something like 3-9 months of work (I have never done a
data-warehousing project from scratch, so it is a bit hard to do the
sizing...). Parsing files, as dbwriter does, is slow, so Platform
Analytics uses a direct DB connection (JDBC) instead. That is a bit
less modular according to some data-warehousing textbooks, and I
believe both qmaster (mbatchd for LSF) and the DB need to be up with
JDBC - so there are pros and cons with each design...

And with the BI front-end, the cluster admin doesn't need to write SQL
code to query the cluster accounting data. But we already have some
SQL samples maintained on gridengine.org by Chris (and others), and
very complex queries are not needed for normal setups. Even with LSF, I
don't think a lot of users need Platform Analytics - ARM uses it because
the company has multiple datacenters, and it is hard for the cluster
admin to match up timestamps by hand to find out whether there is a load
imbalance.
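
As a rough illustration of what redefining the schema for data
warehousing could look like, here is a minimal sketch. Everything in it
is an assumption made up for the example: the table and column names
(dim_cluster, dim_hour, fact_job) are hypothetical and are not ARCo's or
Platform Analytics' real schema, the sample rows are invented, and
SQLite only stands in for the real DB. The idea is that once jobs are
keyed to a shared time dimension, comparing load across datacenters is
a plain GROUP BY rather than manual time matching.

```python
import sqlite3

# Hypothetical star schema: one fact table of finished jobs plus two small
# dimension tables. None of these names come from ARCo or Platform Analytics.
SCHEMA = """
CREATE TABLE dim_cluster (cluster_id INTEGER PRIMARY KEY, name TEXT, datacenter TEXT);
CREATE TABLE dim_hour    (hour_id INTEGER PRIMARY KEY, day TEXT, hour INTEGER);
CREATE TABLE fact_job (
    job_id      INTEGER,
    cluster_id  INTEGER REFERENCES dim_cluster(cluster_id),
    hour_id     INTEGER REFERENCES dim_hour(hour_id),
    cpu_seconds REAL
);
"""


def build_demo_db():
    """Create an in-memory database and load a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_cluster VALUES (?, ?, ?)",
                     [(1, "east", "dc1"), (2, "west", "dc2")])
    conn.executemany("INSERT INTO dim_hour VALUES (?, ?, ?)",
                     [(1, "2011-10-18", 9), (2, "2011-10-18", 10)])
    conn.executemany("INSERT INTO fact_job VALUES (?, ?, ?, ?)",
                     [(101, 1, 1, 3600.0), (102, 1, 1, 1800.0),
                      (103, 2, 1, 600.0), (104, 2, 2, 7200.0)])
    return conn


def cpu_hours_by_datacenter(conn):
    """Roll up CPU hours per (day, hour, datacenter) via the shared time dimension."""
    rows = conn.execute("""
        SELECT h.day, h.hour, c.datacenter, SUM(f.cpu_seconds) / 3600.0
        FROM fact_job f
        JOIN dim_cluster c ON c.cluster_id = f.cluster_id
        JOIN dim_hour h ON h.hour_id = f.hour_id
        GROUP BY h.day, h.hour, c.datacenter
        ORDER BY h.day, h.hour, c.datacenter
    """)
    return rows.fetchall()


if __name__ == "__main__":
    for row in cpu_hours_by_datacenter(build_demo_db()):
        print(row)
```

A BI front-end would generate this kind of roll-up itself, which is why
the admin would not have to write the SQL by hand.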

Rayson



On Tue, Oct 18, 2011 at 3:41 PM, William Bryce <[email protected]> wrote:
> You can get it. It is called 'Pentaho' (http://www.pentaho.com). You will
> have to build your own cubes and reports for it, though.
>
> Bill.
>
> On 2011-10-18, at 1:29 AM, Chi Chan wrote:
>
>> From: William Bryce <[email protected]>
>>
>>> Yup, exactly right.  That is why we created UniSight.
>>> It is an ETL engine and complete BI server based on Pentaho.
>>> So when you run your report query on 5 million jobs it doesn't take the 
>>> whole day.
>>
>> When are we going to get the source code of UniSight?
>>
>>
>> From: Rayson Ho <[email protected]>
>>> Another example, GPU integration. A lot of sites don't have CUDA or
>>> OpenCL applications. While Platform's GPU integration is still the
>>> best among all of the solutions provided by major batch systems
>>> (including the new one in Open Grid Scheduler), over 90% of the sites
>>> don't need it.
>>
>> I wanted to take a look at the GPU integration. Where's the OGS download 
>> link?
>>
>> --Chi
>>
>>
>>
>>
>>
>>
>> On 2011-10-17, at 2:36 PM, Rayson Ho wrote:
>>
>>> 2011/10/11 Chi Chan <[email protected]>:
>>>> However, if you look at the growth rate of Platform before 2000, Platform
>>>> grew at least 50% per year. Of course, as companies get larger it is
>>>> harder to grow, but another factor is that there are more Platform LSF
>>>> competitors, like SGE, Torque/Maui, Condor, SLURM, etc., and they have
>>>> similar functionality but are much cheaper.
>>>
>>>
>>> It has been like that for a few years - nowadays people just switch
>>> from SGE to PBS, LSF to SGE, or to SLURM or Condor, etc... without
>>> worrying much about the features (or lack thereof) in any batch
>>> system.
>>>
>>> A lot of the distinguishing features in LSF are not used by over 80%
>>> of the users. But people who need those features are willing to pay
>>> the expensive licensing cost!
>>>
>>> For example, Platform has parallel-environment integrations for
>>> (almost) every supercomputer platform. So for instance, if you don't
>>> have SGI machines (which over 95% of the sites don't), then you won't
>>> need the SGI MPT integration.
>>>
>>> Another example, GPU integration. A lot of sites don't have CUDA or
>>> OpenCL applications. While Platform's GPU integration is still the
>>> best among all of the solutions provided by major batch systems
>>> (including the new one in Open Grid Scheduler), over 90% of the sites
>>> don't need it.
>>>
>>> And finally, Platform Analytics does reporting properly! I attended a
>>> business analyst workshop a few years ago, and after learning how
>>> companies handle their data for analysis, I immediately realized
>>> that the ARCo way was not efficient. While both ARCo & Platform
>>> Analytics store job & cluster information in a database, the way data
>>> is archived & processed makes a huge difference in the load on the
>>> database, and in the type of queries that can be easily issued by the
>>> cluster administrator. I used to avoid mentioning this, but as
>>> Platform will be owned by IBM it does not matter now. Luckily, the
>>> work needed to fix ARCo is not huge: in the end, Platform
>>> Analytics uses a 3rd-party front-end, so ARCo can be re-architected to
>>> do very similar things relatively easily.
>>>
>>> Rayson
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
>>
>> William Bryce | VP of Products
>> Univa Corporation - 1001 Warrenville Road, Suite 100 Lisle, Il, 65032 USA
>> Email [email protected] | Mobile: 512.751.8014 | Office: 416.519.2934
>

