Hi Andreas,

> Question 1) When I call the aggregate, I would like to pass sample_size
> with a sub-query, e.g.
> ==> "SELECT bloomfilter_uda(name, (SELECT count(*) FROM test_table))
> FROM test_table;" <==
> Is that possible with Cassandra?

Subqueries are not supported by Cassandra.

> Question 2) When I try to register the bloomfilter_uda, I get the
> following error:
> ==> InvalidRequest: Error from server: code=2200 [Invalid query]
> message="Invalid set literal for (dummy) of type bloomfilter_udt" <==
> Can I just pass Cassandra data types as a state (map, list, set)?

Sorry, this question is not clear to me. Do you have a specific example?

> Question 3) If I assume, all of the above is my bad, how can I access
> the props of the state? Like
> ==> state.n_as_sample_size <==
> Is this somehow possible?

You cannot access the state. You can only access the value returned by the
aggregate, so something like

SELECT bloomfilter_uda(name, count) [...]

Once you get the UDT value, you can extract the information you want from it,
depending on which driver you are using (e.g. Java:
https://docs.datastax.com/en/developer/java-driver/3.7/manual/udts/).

If you are only interested in one field, you can get the field value directly
using

SELECT bloomfilter_uda(name, count).bloom_filter_as_map [...]

These questions are better suited to the user mailing list. The dev mailing
list is for discussion of C* internal development.

On Sat, May 2, 2020 at 12:01 PM Andreas R. <andreasrimmelspac...@gmail.com>
wrote:

> Hello everybody,
>
> I am new to Cassandra and I am trying to extract sketches (e.g. bloom
> filter) database side from some given data. I came across user defined
> types/functions/aggregates and coded some simple implementation. But now
> I am stuck, my questions break down to the below:
>
> CREATE TYPE bloomfilter_udt (
>     n_as_sample_size int,
>     m_as_number_of_buckets int,
>     p_as_next_prime_above_m bigint,
>     hash_for_string_coefficient_a list<bigint>,
>     hash_for_number_coefficients_a list<bigint>,
>     hash_for_number_coefficients_b list<bigint>,
>     bloom_filter_as_map map<int,int>
> );
>
> CREATE OR REPLACE FUNCTION bloomfilter_udf (
>     state bloomfilter_udt,
>     value text,
>     sample_size int
> )
> CALLED ON NULL INPUT
> RETURNS bloomfilter_udt
> LANGUAGE java AS
> $$
>     //fill state which is user defined type bloomfilter_udt with some data
>     return state;
> $$
> ;
>
> CREATE OR REPLACE AGGREGATE bloomfilter_uda (
>     text,
>     int
> )
> SFUNC bloomfilter_udf
> STYPE bloomfilter_udt
> INITCOND {}
>
> Question 1) When I call the aggregate, I would like to pass sample_size
> with a sub-query, e.g.
> ==> "SELECT bloomfilter_uda(name, (SELECT count(*) FROM test_table))
> FROM test_table;" <==
> Is that possible with Cassandra?
>
> Question 2) When I try to register the bloomfilter_uda, I get the
> following error:
> ==> InvalidRequest: Error from server: code=2200 [Invalid query]
> message="Invalid set literal for (dummy) of type bloomfilter_udt" <==
> Can I just pass Cassandra data types as a state (map, list, set)?
>
> Question 3) If I assume, all of the above is my bad, how can I access
> the props of the state? Like
> ==> state.n_as_sample_size <==
> Is this somehow possible?
>
> I'd appreciate some help/hints.
> Thanks
> Andreas
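
For illustration only: the body of bloomfilter_udf is elided in the quoted message ("fill state ... with some data"), so what follows is a minimal plain-Java sketch, not Andreas's implementation, of how a state function with the (state, value, sample_size) shape might fill a state laid out like bloomfilter_udt. The class BloomSketch, the State holder, the sizing rule m = 8 * sample_size and the fixed hash coefficients are all assumptions invented for this sketch. Inside a real Cassandra UDF the state would be a UDT value handed in by the server rather than this class. The sketch builds the state itself when it arrives as null, which is, as far as I understand, what a function declared CALLED ON NULL INPUT would see on the first row if the aggregate had no INITCOND.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java stand-in for the aggregate state; nothing here is Cassandra API. */
final class BloomSketch {

    /** Hypothetical mirror of some bloomfilter_udt fields. */
    static final class State {
        int n;       // n_as_sample_size
        int m;       // m_as_number_of_buckets
        long p;      // p_as_next_prime_above_m
        long[] a;    // hash_for_number_coefficients_a
        long[] b;    // hash_for_number_coefficients_b
        Map<Integer, Integer> buckets = new HashMap<>(); // bloom_filter_as_map
    }

    static boolean isPrime(long c) {
        if (c < 2) return false;
        for (long d = 2; d * d <= c; d++) {
            if (c % d == 0) return false;
        }
        return true;
    }

    /** Smallest prime strictly greater than m (trial division, fine for small m). */
    static long nextPrimeAbove(long m) {
        long c = m + 1;
        while (!isPrime(c)) c++;
        return c;
    }

    /** i-th bucket of a value: ((a_i * x + b_i) mod p) mod m, with x derived from hashCode. */
    static int bucket(State s, int i, String value) {
        long x = Math.floorMod((long) value.hashCode(), s.p);
        return (int) (((s.a[i] * x + s.b[i]) % s.p) % s.m);
    }

    /** Same argument order as the quoted SFUNC: (state, value, sample_size). */
    static State add(State state, String value, int sampleSize) {
        if (state == null) {
            // First row: build the state instead of relying on an initial literal.
            state = new State();
            state.n = sampleSize;
            state.m = Math.max(8, 8 * sampleSize);   // assumed sizing rule
            state.p = nextPrimeAbove(state.m);
            state.a = new long[] {3, 5, 7};          // assumed coefficients
            state.b = new long[] {1, 2, 4};
        }
        if (value == null) return state;             // skip null column values
        for (int i = 0; i < state.a.length; i++) {
            state.buckets.put(bucket(state, i, value), 1);
        }
        return state;
    }

    /** True if every bucket for the value is set (false positives are possible). */
    static boolean mightContain(State s, String value) {
        for (int i = 0; i < s.a.length; i++) {
            if (!s.buckets.containsKey(bucket(s, i, value))) return false;
        }
        return true;
    }
}
```

Calling BloomSketch.add(null, "alice", 4) returns a freshly sized state, and each later call folds one more value into the same state, much as an aggregate would invoke its state function once per row. Since subqueries are not available, the sample size would have to come from a separate count query run beforehand.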